Sourcing & Recruitment Info

Tips | Hacks | Tricks | Ideas | Information | Innovation | Opinion

Sourcing on GitHub

This piece is split off from my post on how you should use GitHub and StackOverflow in technical recruitment. Here I am going to cover how to:

  1. utilize the built-in search engine of GitHub to find software engineers,
  2. X-ray GitHub to find software engineers,
  3. find developers based on their code contribution, and
  4. contact the potential candidates you have found.

1. Searching for GitHub users

GitHub has a pretty neat and advanced built-in search engine here: https://github.com/search/advanced.

The Boolean operators AND, OR and NOT are supported, but the maximum number of operators you can use is 5. Some searches will end prematurely when the server times them out, but I have only experienced this when listing millions of results. You can recognize a timeout by the exclamation mark that appears with the results.

[Screenshot: the search timeout warning]

Another limitation of this engine is that your query has to fit within 128 characters.
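The two limits above can be checked before you paste a query into the search box. The following is a minimal sketch: the function name `check_query` is made up for illustration, and the limits are simply the numbers quoted in this article, not values read from GitHub.

```python
import re

MAX_OPERATORS = 5       # operator limit as quoted in this article
MAX_QUERY_LENGTH = 128  # character limit as quoted in this article


def check_query(query: str) -> list[str]:
    """Return a list of problems; an empty list means the query fits both limits."""
    problems = []
    # Drop quoted phrases so words inside them are not counted as operators.
    unquoted = re.sub(r'"[^"]*"', " ", query)
    operators = re.findall(r"\b(?:AND|OR|NOT)\b", unquoted)
    if len(operators) > MAX_OPERATORS:
        problems.append(f"{len(operators)} Boolean operators (limit {MAX_OPERATORS})")
    if len(query) > MAX_QUERY_LENGTH:
        problems.append(f"{len(query)} characters (limit {MAX_QUERY_LENGTH})")
    return problems


print(check_query("python OR ruby"))
```

Only upper-case AND, OR and NOT outside quoted phrases are counted, so a phrase such as "this AND that" in quotes does not use up one of your five operators in this check.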

Although finding users is not the primary use of this search engine, it is easy to search for users in a certain location who program in a certain language, and they certainly sound like potential candidates :). To start, scroll down on the search page linked above, type in a location and hit the Search button.

[Screenshot: the users section of the advanced search form]

Although the given example of San Francisco, CA implies that adding the state (US) or country (rest of the world) after the city name is the best way to use this field, that is not actually true. Searching for San Francisco, CA returns user profiles that have either San Francisco OR CA set as their location, as seen in the picture below.

[Screenshot: user results for San Francisco, CA]

This is still useful, but certainly not what you would expect from this format. So if you only want results whose location includes both the city and the state/country (this can be pretty useful if, for example, you are looking for people from York, UK), you will have to change this:

location:"San Francisco" location:CA

to this:

location:"San Francisco,CA"

Note: This edit cannot be made in the original search window, only on the results page. Keep in mind that some people fill in the location field with just the country, while others fill in the city. It’s best to make sure you have searched for both options.
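To keep track of the variants worth running for one place, something like the following sketch could help. The helper names `location_qualifier` and `location_variants` are invented for this illustration; only the location: syntax comes from the search described above.

```python
def location_qualifier(place: str) -> str:
    """Wrap a place in a location: qualifier, quoting it if it has spaces or commas."""
    if " " in place or "," in place:
        return f'location:"{place}"'
    return f"location:{place}"


def location_variants(city: str, region: str) -> list[str]:
    """Searches to try for one city: exact "city,region", city only, region only."""
    return [
        location_qualifier(f"{city},{region}"),
        location_qualifier(city),
        location_qualifier(region),
    ]


for query in location_variants("San Francisco", "CA"):
    print(query)
```

Each line of the output is meant to be pasted into the search box on the results page, one search at a time.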

The next step is to narrow down users who code in a specific language. The filters on the left side can help you choose from the most popular languages in the area, but with a bit of URL editing you can essentially choose any programming language.

[Screenshot: the language filter on the results page]

If you want a programming language that is not on the list, just change the ‘Python’ in the URL. For example, for Visual Basic software engineers in London, use this URL:

https://github.com/search?l=Visual Basic&q=location%3A"london"&ref=searchresults&type=Users&utf8=✓

Your browser will encode the spaces automatically. GitHub classifies repositories into the following languages; essentially, this is your menu.

Language | What to write
ActionScript | ActionScript
C | C
C# | C%23
C++ | C%2B%2B
Clojure | Clojure
CoffeeScript | CoffeeScript
CSS | CSS
Go | Go
Haskell | Haskell
HTML | HTML
Java | Java
JavaScript | JavaScript
Lua | Lua
Matlab | Matlab
Objective-C | Objective-C
Perl | Perl
PHP | PHP
Python | Python
R | R
Ruby | Ruby
Scala | Scala
Shell | Shell
Swift | Swift
TeX | TeX
VimL | VimL
ABAP | ABAP
Ada | Ada
Agda | Agda
AGS Script | AGS Script
Alloy | Alloy
AMPL | AMPL
Ant Build System | Ant Build System
ANTLR | ANTLR
ApacheConf | ApacheConf
Apex | Apex
APL | APL
AppleScript | AppleScript
Arc | Arc
Arduino | Arduino
AsciiDoc | AsciiDoc
ASP | ASP
AspectJ | AspectJ
Assembly | Assembly
ATS | ATS
Augeas | Augeas
AutoHotkey | AutoHotkey
AutoIt | AutoIt
Awk | Awk
Batchfile | Batchfile
Befunge | Befunge
Bison | Bison
BitBake | BitBake
BlitzBasic | BlitzBasic
BlitzMax | BlitzMax
Bluespec | Bluespec
Boo | Boo
Brainfuck | Brainfuck
Brightscript | Brightscript
Bro | Bro
C-ObjDump | C-ObjDump
C2hs Haskell | C2hs Haskell
Cap’n Proto | Cap’n Proto
CartoCSS | CartoCSS
Ceylon | Ceylon
Chapel | Chapel
ChucK | ChucK
Cirru | Cirru
Clean | Clean
CLIPS | CLIPS
CMake | CMake
COBOL | COBOL
ColdFusion | ColdFusion
ColdFusion CFC | ColdFusion CFC
Common Lisp | Common Lisp
Component Pascal | Component Pascal
Cool | Cool
Coq | Coq
Cpp-ObjDump | Cpp-ObjDump
Creole | Creole
Crystal | Crystal
Cucumber | Cucumber
Cuda | Cuda
Cycript | Cycript
Cython | Cython
D | D
D-ObjDump | D-ObjDump
Darcs Patch | Darcs Patch
Dart | Dart
desktop | desktop
Diff | Diff
DM | DM
Dockerfile | Dockerfile
Dogescript | Dogescript
DTrace | DTrace
Dylan | Dylan
E | E
Eagle | Eagle
eC | eC
Ecere Projects | Ecere Projects
ECL | ECL
edn | edn
Eiffel | Eiffel
Elixir | Elixir
Elm | Elm
Emacs Lisp | Emacs Lisp
EmberScript | EmberScript
Erlang | Erlang
F# | F%23
Factor | Factor
Fancy | Fancy
Fantom | Fantom
Filterscript | Filterscript
fish | fish
FLUX | FLUX
Formatted | Formatted
Forth | Forth
FORTRAN | FORTRAN
Frege | Frege
G-code | G-code
Game Maker Language | Game Maker Language
GAMS | GAMS
GAP | GAP
GAS | GAS
GDScript | GDScript
Genshi | Genshi
Gentoo Ebuild | Gentoo Ebuild
Gentoo Eclass | Gentoo Eclass
Gettext Catalog | Gettext Catalog
GLSL | GLSL
Glyph | Glyph
Gnuplot | Gnuplot
Golo | Golo
Gosu | Gosu
Grace | Grace
Gradle | Gradle
Grammatical Framework | Grammatical Framework
Graph Modeling Language | Graph Modeling Language
Graphviz (DOT) | Graphviz (DOT)
Groff | Groff
Groovy | Groovy
Groovy Server Pages | Groovy Server Pages
Hack | Hack
Haml | Haml
Handlebars | Handlebars
Harbour | Harbour
Haxe | Haxe
HTML+Django | HTML%2BDjango
HTML+ERB | HTML%2BERB
HTML+PHP | HTML%2BPHP
HTTP | HTTP
Hy | Hy
IDL | IDL
Idris | Idris
IGOR Pro | IGOR Pro
Inform 7 | Inform 7
INI | INI
Inno Setup | Inno Setup
Io | Io
Ioke | Ioke
IRC log | IRC log
Isabelle | Isabelle
J | J
Jade | Jade
Jasmin | Jasmin
Java Server Pages | Java Server Pages
JSON | JSON
JSON5 | JSON5
JSONiq | JSONiq
JSONLD | JSONLD
Julia | Julia
Kit | Kit
Kotlin | Kotlin
KRL | KRL
LabVIEW | LabVIEW
Lasso | Lasso
Latte | Latte
Lean | Lean
Less | Less
LFE | LFE
LilyPond | LilyPond
Liquid | Liquid
Literate Agda | Literate Agda
Literate CoffeeScript | Literate CoffeeScript
Literate Haskell | Literate Haskell
LiveScript | LiveScript
LLVM | LLVM
Logos | Logos
Logtalk | Logtalk
LOLCODE | LOLCODE
LookML | LookML
LoomScript | LoomScript
LSL | LSL
M | M
Makefile | Makefile
Mako | Mako
Markdown | Markdown
Mask | Mask
Mathematica | Mathematica
Maven POM | Maven POM
Max | Max
MediaWiki | MediaWiki
Mercury | Mercury
MiniD | MiniD
Mirah | Mirah
Modelica | Modelica
Monkey | Monkey
Moocode | Moocode
MoonScript | MoonScript
MTML | MTML
MUF | MUF
mupad | mupad
Myghty | Myghty
Nemerle | Nemerle
nesC | nesC
NetLinx | NetLinx
NetLinx+ERB | NetLinx%2BERB
NetLogo | NetLogo
NewLisp | NewLisp
Nginx | Nginx
Nimrod | Nimrod
Ninja | Ninja
Nit | Nit
Nix | Nix
NL | NL
NSIS | NSIS
Nu | Nu
NumPy | NumPy
ObjDump | ObjDump
Objective-C++ | Objective-C%2B%2B
Objective-J | Objective-J
OCaml | OCaml
Omgrofl | Omgrofl
ooc | ooc
Opa | Opa
Opal | Opal
OpenCL | OpenCL
OpenEdge ABL | OpenEdge ABL
OpenSCAD | OpenSCAD
Org | Org
Ox | Ox
Oxygene | Oxygene
Oz | Oz
Pan | Pan
Papyrus | Papyrus
Parrot | Parrot
Parrot Assembly | Parrot Assembly
Parrot Internal Representation | Parrot Internal Representation
Pascal | Pascal
PAWN | PAWN
Perl6 | Perl6
PigLatin | PigLatin
Pike | Pike
PLpgSQL | PLpgSQL
PLSQL | PLSQL
Pod | Pod
PogoScript | PogoScript
PostScript | PostScript
PowerShell | PowerShell
Processing | Processing
Prolog | Prolog
Propeller Spin | Propeller Spin
Protocol Buffer | Protocol Buffer
Public Key | Public Key
Puppet | Puppet
Pure Data | Pure Data
PureBasic | PureBasic
PureScript | PureScript
Python traceback | Python traceback
QMake | QMake
QML | QML
Racket | Racket
Ragel in Ruby Host | Ragel in Ruby Host
RAML | RAML
Raw token data | Raw token data
RDoc | RDoc
REALbasic | REALbasic
Rebol | Rebol
Red | Red
Redcode | Redcode
RenderScript | RenderScript
reStructuredText | reStructuredText
RHTML | RHTML
RMarkdown | RMarkdown
RobotFramework | RobotFramework
Rouge | Rouge
Rust | Rust
Sage | Sage
SaltStack | SaltStack
SAS | SAS
Sass | Sass
Scaml | Scaml
Scheme | Scheme
Scilab | Scilab
SCSS | SCSS
Self | Self
ShellSession | ShellSession
Shen | Shen
Slash | Slash
Slim | Slim
Smalltalk | Smalltalk
Smarty | Smarty
SourcePawn | SourcePawn
SPARQL | SPARQL
SQF | SQF
SQL | SQL
SQLPL | SQLPL
Squirrel | Squirrel
Standard ML | Standard ML
Stata | Stata
STON | STON
Stylus | Stylus
SuperCollider | SuperCollider
SVG | SVG
SystemVerilog | SystemVerilog
Tcl | Tcl
Tcsh | Tcsh
Tea | Tea
Text | Text
Textile | Textile
Thrift | Thrift
TOML | TOML
Turing | Turing
Turtle | Turtle
Twig | Twig
TXL | TXL
TypeScript | TypeScript
Unified Parallel C | Unified Parallel C
UnrealScript | UnrealScript
Vala | Vala
VCL | VCL
Verilog | Verilog
VHDL | VHDL
Visual Basic | Visual Basic
Volt | Volt
Web Ontology Language | Web Ontology Language
WebIDL | WebIDL
wisp | wisp
xBase | xBase
XC | XC
XML | XML
Xojo | Xojo
XProc | XProc
XQuery | XQuery
XS | XS
XSLT | XSLT
Xtend | Xtend
YAML | YAML
Zephir | Zephir
Zimpl | Zimpl
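The "What to write" values are just the language names with URL-unsafe characters such as # and + percent-encoded, so the search URL can also be built by a few lines of code. This is a sketch only: `user_search_url` is a made-up helper, and it leaves out the ref and utf8 parameters from the URL shown earlier on the assumption that they are not required.

```python
from urllib.parse import quote, urlencode


def user_search_url(location: str, language: str) -> str:
    """Build a GitHub user-search URL filtered by location and one language."""
    params = {
        "l": language,                  # language filter, e.g. "Visual Basic" or "C#"
        "q": f'location:"{location}"',  # location qualifier in straight quotes
        "type": "Users",
    }
    # quote_via=quote encodes spaces as %20 and "#" / "+" as %23 / %2B.
    return "https://github.com/search?" + urlencode(params, quote_via=quote)


print(user_search_url("london", "Visual Basic"))
print(user_search_url("london", "C#"))
```

Passing the plain language name from the left-hand column is enough here, because the encoding step produces the right-hand column for you.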

The limitation of this method is that you can only filter on one programming language. Fortunately there is a way to overcome this — with the help of Google.

2. X-raying GitHub

Finding people who use one programming language might be enough for you, but often you need proficiency in at least two. For that, you are going to need the ability to X-ray GitHub.

Unlike StackOverflow, GitHub has no part of the URL that directly shows you that a certain page is a profile page. So when you are searching GitHub you have to use a technique similar to the one you can use with LinkedIn: locate parts of the page that are specific to user entries. One example is the text “contributions in the last year”, which appears on profile pages.

[Screenshot: the contributions summary on a profile page]

Check this search to find just user profiles.

site:github.com "contributions in the last year"

The minimum you would want to search for is again the location and the programming language, and that is extremely easy: just add them to the end of the string. The language is not visible in the rendered page, but it is in the source code.

[Screenshot: the language name in the page source]

So to find software engineers in London who program in JavaScript and Python, simply search for:

site:github.com "contributions in the last year" python javascript london

You can add more locations with an OR statement. Keep in mind that some users will list cities and some will list countries. You can use the languages from the table above.
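If you run this kind of search often, the string can be assembled from a list of languages and locations. The sketch below uses a made-up helper, `xray_query`; the only fixed parts are the site: operator and the profile-page phrase used above.

```python
def xray_query(languages: list[str], locations: list[str]) -> str:
    """Build a Google X-ray search string for GitHub profile pages."""
    parts = ["site:github.com", '"contributions in the last year"']
    parts.extend(languages)
    # Quote multi-word locations, then join the alternatives with OR.
    quoted = [f'"{loc}"' if " " in loc else loc for loc in locations]
    if len(quoted) == 1:
        parts.append(quoted[0])
    elif quoted:
        parts.append("(" + " OR ".join(quoted) + ")")
    return " ".join(parts)


print(xray_query(["python", "javascript"], ["london"]))
print(xray_query(["python", "javascript"], ["london", "united kingdom"]))
```

The second call covers both users who list the city and users who list only the country in one search.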

3. Searching for code on GitHub

Software engineers use GitHub’s internal search engine to find code from other developers (and the repositories, the “folders” that store the code). Once you have found the code, you can check who created or contributed to it. This presents the opportunity to find people based on the output of their work. What better way can you imagine to narrow down your pool to people who are skilled for the job?

Unfortunately, there is a downside to this search method. You cannot filter by location (it would not really make sense for developers to look for code just from, say, France, would it?). Adding a criterion such as location:Paris on the results page does not change the output either. So this method is mostly useful for remote opportunities or for companies with relocation packages.

The idea requires cooperation with your Hiring Manager. Ask them for a function or a short piece of example code that the future hire will work with, and might already be using. Then return to the GitHub advanced search and use it to find code from software developers that contains the snippet provided by your HM. Once you find the code, you have found the user who created it.

Let’s use the fast inverse square root implementation from Quake 3 (supposedly a difficult piece of code :)) as an example to show this in practice. The original is written in C, but this example looks for it in C++ code, so you either have to select C++ in the dropdown list of languages or use .cc or .cpp as the extension.

Here is what you will get: a ton of software developers using float invsqrt in C++ in a way similar to the code used by your company/client.

[Screenshot: code search results for float invsqrt]

The key to this is obviously your relationship with the Hiring Manager, and whether they can come up with a particularly important code example.
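Going from matching code to people can be sketched in a few lines, assuming you run the same search through GitHub's REST search API rather than the website. In that API each code result names its repository and the repository's owner. The `repo_owners` helper and the sample data below are hypothetical and heavily trimmed; real responses carry many more fields.

```python
def repo_owners(search_response: dict) -> list[str]:
    """Return the unique repository owners from a code-search response, in first-seen order."""
    owners = []
    for item in search_response.get("items", []):
        login = item["repository"]["owner"]["login"]
        if login not in owners:
            owners.append(login)
    return owners


# Hypothetical, trimmed-down response used only to show the shape of the data.
sample = {
    "items": [
        {"name": "math.cpp", "repository": {"full_name": "alice/engine", "owner": {"login": "alice"}}},
        {"name": "fastmath.cc", "repository": {"full_name": "bob/renderer", "owner": {"login": "bob"}}},
        {"name": "util.cpp", "repository": {"full_name": "alice/tools", "owner": {"login": "alice"}}},
    ]
}
print(repo_owners(sample))
```

The owner of a repository is not always the person who wrote the matching file, so treat the output as a starting list and check the commit history before reaching out.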

4. Finding contact details of candidates found on GitHub

Needless to say, the hunt for contact details is going to be the most difficult part of sourcing on GitHub. Thankfully, in more and more cases there is an email address or a website listed on the profile, and you can contact your potential candidate through these. Things get a little more complicated if this information is not given. One option is to use one of the extensions mentioned in the main post, just the other way around: finding other profiles based on the GitHub profile. Alternatives are running a Google Image Search (right click, then search Google with this image), or running a username check on namechk.com.

There used to be a wonderful trick with an API link which showed you the email address of ALL users, but GitHub has addressed that and it no longer shows all addresses. It is still worth checking, because some users still have their address listed, and you can find other useful information such as a Gravatar profile link. The API link is https://api.github.com/users/flavienlaurent/ ; for any other user, just replace “flavienlaurent” with their username.
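The lookup can be scripted if you have a list of usernames. The sketch below is only an illustration: the helper names are made up, the choice of fields is my own selection from what the public profile endpoint returns, and unauthenticated requests are rate-limited, so it is not suited to bulk runs.

```python
import json
from urllib.request import urlopen


def profile_api_url(username: str) -> str:
    """Return the public API URL for a GitHub user."""
    return f"https://api.github.com/users/{username}"


def contact_fields(profile: dict) -> dict:
    """Keep only the fields that help with outreach, skipping empty ones."""
    wanted = ("name", "email", "blog", "company", "location", "html_url")
    return {key: profile[key] for key in wanted if profile.get(key)}


def fetch_contact_fields(username: str) -> dict:
    """Download a profile and reduce it to its contact fields (needs network access)."""
    with urlopen(profile_api_url(username)) as response:
        return contact_fields(json.load(response))


# Offline example with a hypothetical profile; email and blog are empty here.
print(contact_fields({"login": "jdoe", "name": "Jane Doe", "email": None, "blog": "", "location": "London"}))
```

Calling fetch_contact_fields with a real username returns whatever that user has chosen to make public, which is often less than the profile page itself suggests.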

If you found this article before reading the main post on the other, even more important, ways to use GitHub and StackOverflow in technical recruitment, be sure to read it now.


Comments

  1. Great work Vince. I’ve tried to get location to work with search operators as they claim it can here:

    https://help.github.com/articles/searching-github/#limitations-on-query-length

    Would love to learn how you store the contacts that you extract from this type of search if you are looking in bulk. I have been using Blockspring with great success for this and other sources. Would love to chat online sometime.

  2. Vinceszy

    January 6, 2016 at 11:19 pm

    Thanks Aaron!

    In my experience, the onsite location search works good, the only problems might be what people enter there (or do not enter there).

    I am using import.io to turn the result list of github to excel, but that only works good with onsite search. I would definitely be in for a chat, what is your preferred medium? 🙂


© 2017 Sourcing & Recruitment Info

Rights owned by Vince Szymczak
