and Me

It’s been a month since I started contributing to rubygems. My project is to import the endpoints used by bundler to install gems into the rubygems Rails app. Did you know that

Fetching gem metadata from

is kind of a lie? Bundler is actually talking to a Sinatra app (bundler-api), which is separate from the rubygems Rails app. Rubygems didn’t have the infrastructure to handle all the requests coming from all the Rubyists in the world running bundle install, which is why the dependency endpoint used by bundler had to be served from a different app. Now, with the help of Ruby Together, we have the resources, so we would really like to stop maintaining two applications. It is so not fun when people can’t find new version releases.

My project got a jump start because #1225 was merged. I only had to make a few changes so that the response matched that of bundler-api. In the process, I removed some things we weren’t using anymore: the rubyforgers and version_histories tables and the downloads column from the rubygems table. While the former two had not been used for a while, the downloads column went out of service fairly recently. The way rubygems tracks gem downloads is rather interesting. If you asked me how I would track the number of downloads, I would probably say that I would keep a count field and increment it every time someone downloaded a gem. Given that bundler-api handles 5-7k requests per minute, that wouldn’t be a good idea. I know the request stats because Nick (@qrush) gave me access to the New Relic app of rubygems❤ Thanks Nick! Data is beautiful, indeed.


What rubygems does instead is update downloads in bulk. Gems are served over Fastly CDNs, and we process the log files they generate every minute (for details, check out: Simplifying our stack). There is more… We track downloads by version. So how would you find the total downloads of a gem? I would probably say that I would sum the downloads of all the versions of the gem with an ActiveRecord query. What rubygems does is that, during the bulk update, it also updates the version_id: 0 row with:

increment(count, rubygem_id: rubygem_id, version_id: 0)

Now we don’t have to sum over 709 versions of caboose-cms; it is a simple fetch of one row. Following the same pattern, it keeps track of the total count of all gem downloads with:

increment(total_count, rubygem_id: 0, version_id: 0)
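To make the pattern concrete, here is a minimal sketch in plain Ruby. It is not the rubygems code: `DownloadCounts`, its in-memory hash and the `bulk_update` batch format are all hypothetical stand-ins, and only the `increment(...)` call shape and the "id 0 means aggregate" convention come from the lines above.

```ruby
# Hypothetical in-memory stand-in for the downloads table.
# Keys are [rubygem_id, version_id] pairs, values are download counts.
# An id of 0 marks an aggregate row, as described in the post.
class DownloadCounts
  def initialize
    @rows = Hash.new(0)
  end

  def increment(count, rubygem_id:, version_id:)
    @rows[[rubygem_id, version_id]] += count
  end

  def fetch(rubygem_id:, version_id:)
    @rows[[rubygem_id, version_id]]
  end

  # Apply one batch of counts parsed from the logs (assumed format):
  #   { [rubygem_id, version_id] => count }
  def bulk_update(batch)
    total_count = 0
    batch.each do |(rubygem_id, version_id), count|
      increment(count, rubygem_id: rubygem_id, version_id: version_id) # per version
      increment(count, rubygem_id: rubygem_id, version_id: 0)          # per gem
      total_count += count
    end
    increment(total_count, rubygem_id: 0, version_id: 0)               # all gems
  end
end

counts = DownloadCounts.new
counts.bulk_update({ [1, 10] => 3, [1, 11] => 2, [2, 20] => 5 })
gem_total = counts.fetch(rubygem_id: 1, version_id: 0)
grand_total = counts.fetch(rubygem_id: 0, version_id: 0)
```

Reading the total for a gem is then a single lookup of its version_id: 0 row instead of a sum over every version.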

My contribution related to all this is that you will no longer see the “show all versions” link below the download stats on the gem show page.


I guess I haven’t made any significant change so far. David (@dwradcliffe) has suggested that I pick up the metadata migration. That has me excited! I will be changing the way people build gems... or at least I will be writing the code that makes it happen. We also have plans to add features to our search system. We are using elasticsearch on rubygems, and I am pretty sure it can do much more than the basic search we have right now.

Lastly, I would suggest that you check out upsert. I got to know about it through Arthur (@arthurnn); SQL seems to be his thing. It was introduced only in pg 9.5 and I think it’s really cool. I can’t wait until we use it on rubygems.

Running Tests Should be Fast!

Let me name some of the worst things in the world: the final season of Two and a Half Men, stubbing your toe at night, text on a meme that is too small to read, and test suites where you have to wait 5+ seconds before any of your tests run.


On OSEM, I wanted to improve the test coverage, which is shown by the Coverage Status badge. Red is not really my favorite color (unless it is in a git diff). If you don’t see red anymore, you know what happened😉

It was really slow to run tests. I introduced spring, and it did save me file load time but it wasn’t good enough. The aha moment was when I got:

The following factories are invalid: (FactoryGirl::InvalidFactoryError)
event_commercial - Validation failed: Network is unreachable - connect(2) for "" port 80 (ActiveRecord::RecordInvalid)

I have gotten that error exactly once, which kind of makes me worry about how rarely I am offline.

You would think we just introduced webmock and moved on. But wait! There is more. I also found out that we can decouple running FactoryGirl lint from the test suite. Until now, the lint was running before the test suite:

As the README of FactoryGirl suggests, I moved lint to a rake task and added it to travis. In my head, it made a lot of sense. For me, running FactoryGirl lint before every suite is like running rubocop before every request to localhost:3000. As issue #772 on FactoryGirl very rightly points out: what if I am doing TDD? No! TDD is not dead. I am not saying I use TDD; still, running the complete lint when I am running a single test is an overhead I can’t bear. Maybe it only amounts to a couple hundred milliseconds, but those are milliseconds I shouldn’t have to lose.

Some of the valid arguments against my opinion were that we would potentially be writing tests on broken factories and that we would now have to run lint manually. It would indeed suck to write 200 lines of tests only to find out that your factories are not correct. I think it can be solved to an extent by integrating lint with a pre-commit hook.

In the end, we were able to find common ground by moving FactoryGirl lint behind an ENV variable:
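The snippet from OSEM is not reproduced here, so what follows is only a sketch of the idea in plain Ruby. The helper names `factory_lint_enabled?` and `run_factory_lint` and the accepted values ("1", "true", "yes") are my assumptions; only the variable name OSEM_FACTORY_LINT comes from this post. In a real suite, the block passed to `run_factory_lint` would call `FactoryGirl.lint` from a `before(:suite)` hook.

```ruby
# Hypothetical helper: decides whether the factory lint should run,
# based on the OSEM_FACTORY_LINT environment variable.
# The accepted truthy values are an assumption, not OSEM's actual rule.
def factory_lint_enabled?(env = ENV)
  %w[1 true yes].include?(env.fetch('OSEM_FACTORY_LINT', '').downcase)
end

# Runs the given block (the lint) only when the variable opts in.
# Returns :linted or :skipped so the caller can report what happened.
def run_factory_lint(env = ENV)
  return :skipped unless factory_lint_enabled?(env)

  yield
  :linted
end

# Example with an explicit hash instead of the real ENV:
lint_ran = false
result = run_factory_lint({ 'OSEM_FACTORY_LINT' => 'true' }) { lint_ran = true }
```

With the variable unset, a single test starts without paying for the lint; CI can export the variable so broken factories are still caught.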

What would be the value of your ENV['OSEM_FACTORY_LINT']?


Why use GitHub?

A few days back, I had the honor of introducing someone to GitHub. What has me wondering is that I didn’t do it for a newcomer; rather, I explained what GitHub is to someone who has been in the software development field for quite a while.

Are there real people out there who haven’t heard the word “GitHub” before? The following is what I replied to the person of interest:


“Please also belabor for me in as many words as your will prefers, why it will be worth my time to find more about GitHub.”

This should be interesting. The basics would be that it is a remote git hosting service. Each repository is given a project space. Each project has a bug tracker associated with it, a space for pull requests (git patches), milestones, tasks, changelog etc. Since public projects are free to host on GitHub, it has become the de facto hosting platform for open source projects.
Why should you care? Every organization needs a tool for all its developers to collaborate. GitHub can be that tool for you. On GitHub you can create an organization space and have your developers be part of it. You can define permission levels within the organization. Projects can have permission levels as well. So basically, if your organization doesn’t want to deal with the hassle of maintaining a git server and a collaboration tool on top of it, you can just let GitHub manage it all. BTW, if you are one of those miserable souls who have to use SVN, GitHub supports that as well.


Illustration by jeejkang
Okay, enough of the mundane talk. Let me introduce you to the interesting stuff. GitHub is the place where we developers hang out. You can follow others and see what they are working on, which is really cool because everyone is on GitHub. You can follow the developers you worship, see what they have worked on and, if they inspire you enough, work alongside them (cause open source!!). It is also the place for the happening things. Think of any new technology that has come out in the last 5 years; it is most likely being developed on GitHub. AngularJS, jQuery, Bootstrap, Go, Rust, Rails, React, homebrew, node and everything else, all of it is on GitHub.
Moving on to libraries and frameworks!! All the awesome code others have already written, repositories we love❤ Have you really never dealt with a bug in a library you wanted to use? Using GitHub, you can ask the maintainers of that library to fix the bug for you. In fact, they will appreciate that you took the time to report it. redis and emscripten from C, httpie and thefuck from Python, elasticsearch and RxJava from Java, tensorflow and mongo from C++, jekyll and devise from Ruby, and once again the list goes on. All of them are developed on GitHub.


I know GitHub is much more than what I wrote. I left out the platform it provides for publishing your awesome hack: HomeMirror, building courses for everyone: FreeCodeCamp, a collection of free books: free-programming-books, a command-line murder mystery: clmystery, and all German federal laws and regulations (seriously?): gesetze. I really don’t think I will ever be able to describe this wonderland without missing something really cool. If there are more of you out there, I hope you will find this useful and take GitHub for a spin. We love GitHub; I hope you will too. I really can’t say it any better than James did in his Dear GitHub post:


“Dear GitHub,

You have done so much to grow the open source community and make it really accessible to users. Somehow you have us chasing stars and filling up squares, improving the world’s software in the process.”


Bug which depended on other bug in a different library

EDIT: I had only figured out part of the problem. You can read more detailed explanation here.

I was finally able to solve bundler error on Spring. Yay me!

It all started when bundler began cleaning ENV["RUBYLIB"], a change released with bundler 1.11.0. Spring failed to `require bundler/setup` because it was depending on bundler leaving that RUBYLIB path uncleaned. That require wasn’t supposed to fall back to the RUBYLIB path in the first place; it was supposed to find it in GEM_PATH. However, GEM_PATH was an empty string when one chose to change the default bundle install path for their app. I don’t know yet if that is the desired behavior; I guess I will look into it.

I am glad I took the time to figure it out. Now I have a much better understanding of how ENV and $LOAD_PATH work. Hey, did you know that ENV.delete("key") returns the value of the deleted key? Also, do you know the subtle difference between `dup` and `clone`? This example from the Ruby docs sums it up well:

class Klass
  attr_accessor :str
end

module Foo
  def foo; 'foo'; end
end

s1 = Klass.new #=> #<Klass:0x401b3a38>
s1.extend(Foo) #=> #<Klass:0x401b3a38>
s1.foo #=> "foo"

s2 = s1.clone #=> #<Klass:0x401b3a38>
s2.foo #=> "foo"

s3 = s1.dup #=> #<Klass:0x401b3a38>
s3.foo #=> NoMethodError: undefined method `foo' for #<Klass:0x401b3a38>
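The `ENV.delete` behaviour mentioned earlier is just as easy to check. This is a small sketch; the key name HYPOTHETICAL_KEY and its value are made up for illustration.

```ruby
# Set a throwaway variable so nothing real in the environment is touched.
ENV['HYPOTHETICAL_KEY'] = 'some/path'

# delete returns the value that was stored under the key...
removed = ENV.delete('HYPOTHETICAL_KEY')

# ...and nil when the key is not set (it is already gone here).
missing = ENV.delete('HYPOTHETICAL_KEY')
```

This makes it possible to clean a variable and remember its old value in one step.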

You should check out the RailsConf 2015 talk “Breaking Down the Barrier: Demystifying Contributing to Rails” by Eileen Uchitelle. She explains the use of `caller` and `tracepoint`. I found them really useful while understanding the flow of control in Spring. She also demonstrates the use of `git bisect`. Turns out, it is not as intimidating as I thought it would be.


I contributed on rails

Not really? Well, I contributed to rails/spring. It is a gem which Rails apps use to preload their files so that every time you run rails console or tests, you don’t have to wait until all the files are loaded. I had never worked with threads before, but then there is a first time for everything. In fact, it is also the first gem repository whose code I have read end to end. I have contributed to other gems, but I just read the class or module I was working on.

Michael Grosser (@grosser) was really helpful and prompt with reviews of my PRs. Both of my PRs have gone through a lot of scrutiny; still, I am happy that at least one of them is merged. While reading the gem, I realized that rails had kept my exposure limited. In rails one doesn’t have to deal with attr_accesible, require, dependencies etc. You can be as sloppy as you want and still everything will work.
Even after all the scrutiny, my PR managed to break the gem’s functionality. Have you ever used Bundler.setup? $LOAD_PATH? These are a few more things you would probably never use in a normal rails app. Apparently `require bundler/setup` is extensively used in gems. It checks your Gemfile and overrides the $LOAD_PATH with the things it found there.
This is why I feel overwhelmed all the time. I am always afraid that I will break something, that I don’t know what I am doing. I start feeling that Michael and Jon must be thinking that I am an idiot and that I am wasting their time. In the time they spent reviewing my PR, they could probably have made the change themselves. The least I can do is be thankful to them for putting up with me.

When people say that you should read others’ code, they are right. You get to learn about different coding styles and many other cool things. For example, I found out that ActiveSupport has this `strip_heredoc` method, which I could have used when I was testing markdown on glittergallery. Back then I had just written a lot of `+` (rubocop, max-LineLength: 80) and a lot of `\n`.

File.write("path_to_file.rb", <<-RUBY.strip_heredoc)
  class Foo
    def self.omg
      raise "omg"
    end
  end
RUBY

The above code will write the content between <<-RUBY … RUBY to the file you mentioned, with the common leading indentation stripped.
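If ActiveSupport is not available, plain Ruby (2.3 and later) gives a similar result with the squiggly heredoc `<<~`. The sketch below only builds the string, so that nothing is written to disk; the variable name `source` is just for illustration.

```ruby
# The squiggly heredoc (<<~) removes the common leading indentation,
# much like calling strip_heredoc on a <<- heredoc.
source = <<~RUBY
  class Foo
    def self.omg
      raise "omg"
    end
  end
RUBY

# The nesting inside the class is kept; only the outer indent is gone.
first_line = source.lines.first
```

The resulting string could be passed to File.write in the same way as above.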



GlitterGallery now running on VPS

Project: GlitterGallery

We recently completed our work on implementing push over the ssh protocol. Do check out our demo, glittergallery-dev, and report any issues you come across. You can find implementation details in PR #303. Users can now add ssh keys to their profiles and save themselves the hassle of entering credentials every time they push. It also means that we now support sparkleshare. These are the steps you need to follow:

  • Go to settings -> click the ssh key tab -> add a name for your key and paste your sparkleshare key -> click on add key
  • Make a new project
  • Open sparkleshare and go to add hosted projects. In address type: ssh:// and in remote path: /<username>/<project_name>.git
  • Click add and wait. Your repo will be synced in a moment.

#293 Nginx & Unicorn from Ryan was a great resource for setting up the VPS.

Right now, we are working on adding the diff styles of design-with-git. We might not use the code of design-with-git, just the ideas. We are focusing on opacity, mask, toggle and side view. I will keep you guys posted.


Git over ssh

Project: GlitterGallery

We had a minor setback with the implementation of git protocols. I worked on the git https protocol, but later I found out that sparkleshare only supports the ssh protocol. Until now we were planning to host on openshift. I needed access to the ~/.ssh/authorized_keys file for git over ssh to work, but openshift doesn’t give that access. Time to move to a VPS. Kevin got me set up with one and Pingou helped me figure out a few details.

First, I needed to make changes to our web interface so that users can add their public keys to their profiles. This also meant adding a keys model and generating a fingerprint for each key. The next thing is validation of keys when a push or pull is made over ssh. This involves two steps, namely authentication and authorization. The OpenSSH server handles the authentication part, and for authorization I have set up a git shell, which makes an API call to glittergallery to check user access. Besides authorization, the git shell also limits ssh access to git related commands.
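As an illustration of the fingerprint step, here is a minimal sketch in plain Ruby. It assumes the classic colon-separated MD5 fingerprint format, which may not be what glittergallery actually stores, and the method name `ssh_key_fingerprint` and the sample key are hypothetical.

```ruby
require 'digest'

# Hypothetical helper: computes the classic MD5 fingerprint of an
# OpenSSH public key line ("<type> <base64 blob> [comment]").
# The fingerprint is the MD5 digest of the decoded blob, printed as
# colon-separated hex pairs.
def ssh_key_fingerprint(public_key)
  _type, blob, _comment = public_key.strip.split(/\s+/, 3)
  raise ArgumentError, 'malformed public key' if blob.nil?

  decoded = blob.unpack1('m0') # strict base64 decode, raises on bad input
  Digest::MD5.hexdigest(decoded).scan(/../).join(':')
end

# Made-up key material, only to show the shape of the result.
sample_blob = ['hypothetical-key-bytes'].pack('m0')
fingerprint = ssh_key_fingerprint("ssh-rsa #{sample_blob} user@example")
```

Storing the fingerprint next to the key makes it easy to show users which key is which without printing the whole key.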

The git shell I am using is just a fork of gitlab-shell. I am hoping that I won’t need to make any changes to it; however, we won’t be supporting all the features of gitlab-shell (git-annex and git-lfs) yet.
