Navigate file history faster with improved blame view

Whether you're debugging a regression or trying to understand how some code came to have its current shape, you'll often want to see what a file looked like before a particular change. With improved blame view, you can easily see how any portion of your file has evolved over time without viewing the file's full history.

Check out the GitHub Help documentation for more information on using git blame to trace the changes in a file.
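
A minimal sketch of the command-line equivalent, git blame, in a throwaway repository (the file name and commit messages are invented for illustration):

```shell
# Build a tiny repo with two commits touching the same file
tmp=$(mktemp -d) && cd "$tmp" && git init -q
printf 'first line\n' > notes.txt
git add notes.txt
git -c user.name=demo -c user.email=demo@example.com commit -qm 'add notes'
printf 'second line\n' >> notes.txt
git -c user.name=demo -c user.email=demo@example.com commit -qam 'append a line'

# Annotate every line with the commit that last changed it
git blame notes.txt

# Limit the annotation to a line range (here: line 2 only)
git blame -L 2,2 notes.txt
```

To see what the file looked like before a particular change, git blame <rev> -- <file> annotates an older revision, and git show <rev>:<file> prints that version outright.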

Flatiron School joins the GitHub Student Developer Pack

Flatiron School has joined the Student Developer Pack to offer students one free month of their Community-Powered Bootcamp, a flexible online course in web development.

Flatiron School joins the Student Developer Pack

The Community-Powered Bootcamp is a self-paced subscription program for beginners. You'll learn online using the same course of study as the Web Developer Program—a comprehensive curriculum tailored to job seekers. In a month, you can pick up a few in-demand skills and work with a community of other learners to start reaching your goals, whether they are technical literacy, a new programming language, or a new career.

The details

  • Get the first month of tuition free
  • Start 800+ hours of rigorous web development coursework
  • Take on topics like HTML, CSS, JavaScript, Node.js, React, and Ruby on Rails
  • Learn online and at your own pace with a curated community of students
  • Build a portfolio
  • Get help when you need it, 24/7

After one month, you can sign up for a monthly subscription of $149 USD.

The Student Developer Pack gives students free access to the best developer tools from different technology companies like Datadog, GitKraken, Travis CI, and Unreal Engine. Sign up for the pack, and start learning.

Git Merge scholarships and more

Brussels will play host to Git Merge 2017 in February, and planning is already well underway.

We're building Git Merge to be welcoming to and supportive of everyone in the vibrant Git community. To this end, 100% of conference proceeds will once again go to the Software Freedom Conservancy to protect and further FLOSS projects. We are also pleased to offer scholarships as part of our commitment to accessibility and inclusion at GitHub events and to bring Git Merge to a wider audience.

The Git Merge scholarships consist of a number of discounted student tickets and complimentary tickets for people from currently underrepresented groups in tech. We reserve 10% of tickets to all of our events for scholarships and distribute them through partner organizations in the area serving technologists from underrepresented groups. The Git Merge 2017 partners are Rails Girls Belgium, part of the global Rails Girls movement for women in tech, and Operation Code, which supports military veterans and their families learning to code.

Finally, for the first time, we are taking individual applications for scholarship tickets through the Travis Foundation's Diversity Tickets program which makes it easier for events of any size anywhere in the world to reach a more diverse audience. Applications close on January 13th, so there's still time to apply and spread the word!

Bug Bounty anniversary promotion: bigger bounties in January and February

Extra payouts for GitHub Bug Bounty Third Year Anniversary

The GitHub Bug Bounty Program is turning three years old. To celebrate, we're offering bigger bounties for the most severe bugs found in January and February.

The bigger the bug, the bigger the prize

The process is the same as always: hackers and security researchers find and report vulnerabilities through our responsible disclosure process. To recognize the effort these researchers put forth, we reward them with actual money. Standard bounties range between $500 and $10,000 USD and are determined at our discretion, based on overall severity. In January and February we're throwing in bonus rewards for standout individual reports in addition to the usual payouts.

Bug bounty prizes are $12,000, $8,000, $5,000 on top of the usual payouts

And t-shirts, obviously

In addition to cash prizes, we've also made limited edition t-shirts to thank you for helping us hunt down GitHub bugs. We don't have enough for everyone—just for the 15 submitters with the most severe bugs.

Enterprise bugs count, too

GitHub Enterprise is now included in the bounty program. So go ahead and find some Enterprise bugs. If they're big enough you'll be eligible for the promotional bounty. Otherwise, rewards are the same as GitHub.com ($200 to $10,000 USD). For more details, visit our bounty site.

Giving winners some extra cash doesn't mean anyone has to lose. If you find a bug, you'll still receive the standard bounties.

Happy hunting!

Visualize your project's community

A new graph is available in the Graphs tab to visualize your repository's data. With the dependents graph, you can now explore how repositories that contain Ruby gems relate to other repositories on GitHub.

If you're an open source maintainer, this means you can find out more about the community connected to your project in addition to projects that depend on your repository and its forks.

Screenshot of dependents page

The page starts with a list of the latest repositories to depend on your repository, making it easier to discover the newest members of your community. It also allows you to filter by either packages, which are other repositories that are gems, or applications, which are other public repositories that aren't gems themselves but use your gem.

The dependency graph works for Ruby gems today, and we plan to expand support to other package ecosystems in the future. For more on what graphs can tell you about your project, check out our Help guide on Graphs.

Search commit messages

You can now search for commits from either the main search page or within a repository. Quickly discover who removed set -e or find commits that involved refactoring.

Commit search

Check out the GitHub Help documentation for more information on how to search commits.
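
On the command line, git log answers the same questions: --grep searches commit messages, and the -S "pickaxe" option finds commits that added or removed a string like set -e. A quick sketch in a scratch repository (file and message names invented):

```shell
# Scratch repo: one commit adds "set -e", a later one removes it
tmp=$(mktemp -d) && cd "$tmp" && git init -q
printf '#!/bin/sh\nset -e\necho hi\n' > run.sh
git add run.sh
git -c user.name=demo -c user.email=demo@example.com commit -qm 'add run script'
printf '#!/bin/sh\necho hi\n' > run.sh
git -c user.name=demo -c user.email=demo@example.com commit -qam 'refactoring: drop set -e'

# Which commits added or removed the string "set -e"?
git log --oneline -S'set -e'

# Which commit messages mention refactoring?
git log --oneline --grep='refactoring'
```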

Save the date: GitHub Universe 2017

GitHub Universe September 2016

GitHub Universe returns in 2017, and we already have some surprises in store for you. Mark your calendars for October 10-12, 2017 at Pier 70 in San Francisco.

Super Early Bird Tickets available now

We're releasing a limited number of tickets at a super early bird price of $199 USD. There are only 100 tickets available, so make sure to snag yours before they run out.

Audience at GitHub Universe

GitHub Universe is the three-day event for people making the future of software. Immerse yourself in creativity and curiosity with the largest software community in the world. The event is packed with advanced training, deep dives on open source projects, keynotes from industry experts, and a look into successful software teams.

Check out the videos from 2016 at githubuniverse.com.

New theme chooser for GitHub Pages

You can now build a GitHub Pages website with a Jekyll theme in just a few clicks.

  1. Create a new GitHub repository or go to an existing one.
  2. Open the theme chooser in the GitHub Pages section of your repository settings.
  3. Select a theme.

Theme chooser screenshot

Using a Jekyll theme means that your website content lives in Markdown files, which you can edit as needed and manage using your favorite Git workflow.

As soon as you apply a Jekyll theme to your site, you can add more pages simply by committing new Markdown files.
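
Behind the scenes, the theme chooser records your selection in the site's _config.yml, so you can also set or change the theme by hand. A minimal sketch (the theme shown is one of the open source defaults; the title and description are invented):

```yaml
# _config.yml — minimal GitHub Pages site configuration
theme: jekyll-theme-minimal
title: My Project
description: Documentation for my project
```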

The theme chooser replaces the old automatic page generator which didn't use Jekyll. Rest assured, existing GitHub Pages created with the automatic page generator will automatically use a matching Jekyll theme the first time you use the theme chooser.

Finally, the Jekyll themes in the theme chooser are all open sourced on GitHub.

For additional information, check out the documentation.

Git Merge 2017: the full agenda is now live

The complete agenda for Git Merge 2017 is now live. Check it out.

Learn how companies like Facebook, Microsoft, GitHub, Autodesk, Yubico, MIT, Atlassian, and the Software Freedom Conservancy are using Git and how you can apply their process within your team. You'll also meet other developers and join hands-on training courses.

Sample sessions

Scaling Mercurial at Facebook: Insights from the Other Side
Facebook uses Mercurial to host some of the largest, fastest growing distributed version control repositories in the world. In this session they’ll talk about the specific technical and user experience improvements they’ve open sourced to handle their growing scale, with an emphasis on lessons relevant to Git and the Git community.

Git LFS at Light Speed
Git and its extensions are becoming more popular than ever. However, certain use cases may still be suboptimal. We identified a way to dramatically improve performance in a popular Git extension—LFS (Large File Storage)—that required changes to both Git Core and the extension itself. We’ll walk you through the process of a successful contribution to each project with the help of mailing lists and pull requests. If you already have a bit of Git command line knowledge then this talk will prepare you for your first contribution to Git, an extension, or both.

Top Ten Worst Repositories to Host on GitHub
In this talk we'll see what technologies GitHub has developed to handle the more challenging repositories and use cases, from heuristics to replication and quotas, as well as what it takes to back up this data.

Confirmed speakers

Karen Sandler - Executive Director at Software Freedom Conservancy
Karen M. Sandler is the executive director of the Software Freedom Conservancy. Karen is known for her advocacy for free software, particularly as a cyborg in relation to the software on medical devices. Prior to joining the Conservancy, she was executive director of the GNOME Foundation. Before that, she was general counsel of the Software Freedom Law Center. Karen co-organizes Outreachy, the award-winning outreach program for women. She is also pro bono lawyer to the FSF and GNOME. Karen is a recipient of the O’Reilly Open Source Award and cohost of the oggcast Free as in Freedom.

Durham Goode - Tech Lead, Source Control Team at Facebook
Durham is the tech lead on the Source Control team at Facebook. He has spent the past four years making distributed version control scale to some of the largest repositories in the world. He has helped teach thousands of engineers to use source control and has a keen interest in making it more approachable to everyone.

Caren Garcia - Implementation Engineer at BazaarVoice
Caren is an Implementation Engineer at BazaarVoice in Austin, Texas. She's an organizer for her local chapter of Women Who Code, a perennial optimist, and a lover of delicious German beers, fika, tacos, and travel. She is an alumna of, and teaching assistant at, the University of Texas.

Santiago Perez De Rosso - PhD Student, Software Design Group at MIT
Santiago P. De Rosso is a PhD student in the Software Design Group at MIT. He used to work at Google, developing tools to make engineers more productive. He currently spends most of his time thinking about how to make software and the process of software engineering better.

In-Depth Workshops

Git and the Terrible, Horrible, No Good, Very Bad Day
We searched the world over for the gnarliest, most terrifying Git scenarios we could find. In this caffeine-fueled session, you will learn how to use some of the more advanced porcelain commands to detangle all the things.

The Battle for Sub-premacy
Submodules or Subtrees? Both are proposed as solutions for handling dependencies. In this session the gloves are coming off. Which one will win it all?

Jedi Mind Tricks for Git
Learn to channel the Git force and improve your workflows using customized configurations, attributes, and hooks.

Repo 911
Is your repository out of control? Is it so unwieldy and awkward you are embarrassed to be seen with it? It's time to take control. Learn how to clean up your repository with filter-branch and BFG, then use git-lfs for a healthier tomorrow.

Git Simple: Writing Primary Git Functionalities in Ruby
Git can seem unapproachable to new users. Even more seasoned users can forget the simplicity that underpins Git. In this session, we will write the primary functionalities of Git in Ruby. Because of Ruby's approachable syntax, no previous Ruby experience is needed to follow this talk.

See you in Brussels!

Game Off IV Highlights

Last month, we challenged you to create a game based on the theme hacking, modding, and/or augmenting. To everyone who submitted an entry: thank you! A little holiday gift will be working its way to your inbox shortly.

Here's a selection of some of our favorites that you can play, hack on, or learn from. Enjoy!

Sir Jumpelot

Sir Jumpelot » Play in browser · View source

The main character (who has a strong resemblance to Hubot) must survive waves of enemies with some calculated jumping. Be careful: new mods change the game mechanics as you play. Created by @MelvinPoppelaars with Unity and hosted on itch.io.

Lighting

Lighting screenshot » Play in browser · View source

This illuminating game is inspired by a couple of real-life hacks on MIT's Green Building and the Cira Centre in Philadelphia. Created by @hiddenwaffle using three.js, tween.js, and howler.js. Hosted on GitHub Pages.

Muntz

Muntz screenshot » Play in browser · View source

A puzzle game inspired by the work of TV pioneer Earl "Madman" Muntz, a TV hacker from the 1950s. Attempt to simplify increasingly complex circuits. Learn a little about American TV and broadcast history in the process. Created by @joemagiv with Unity.

Boxie Coody

Boxie Coody » Play in browser · View source

A Sokoban-like puzzle game, where a friendly bot called Coody teaches you how to complete your work, and it gets more challenging every day. Created by @LastFlower, @LastLeaf, and @tongtongggggg. Graphics created with Inkscape, sound effects created with LMMS and GeneralUser GS. Hosted on GitHub Pages.

Pongout

Pongout » Play in browser (mobile friendly) · View source

What happens when you remix Pong with Breakout? Pongout! Created by @kurehajime and written in JavaScript. Hosted on GitHub Pages.

Mine Hacker

Mine Hacker » Download · View source

A dungeon-crawling, roguelike Minesweeper game in which you navigate a map of hidden tiles by using the small amount of information provided by the robot's sensors. Created by @BelowParallelStudios using Unity. Music was created with Bosca Ceoil. Graphics created with Aseprite and GIMP.

Byter

Byter » Play in browser · View source · Download

Byter is an open source clicker game, where you have to hack various targets. Created by @KyleBanks with Unity.

Hacking in Progress

Hacking in Progress » Download (Win) · View source

Run around the computer store in this 80s-style stealth hacking game, but be sure to avoid the clerks and security cameras. Created by @flipcoder using their very own open source 2D/3D OpenGL game engine called Qor.

Airplane

Airplane » Play in browser · View source

Enjoy this one with a friend or coworker as you try to destroy one another's airplanes. Created by @IonicaBizau. Written in JavaScript and hosted on GitHub Pages.

The Terminal

The Terminal » Download (Windows) · Build from source (All platforms)

A cooperative hacking game in the style of Keep Talking and Nobody Explodes. One player must hack into a computer terminal, while the other player follows instructions from the manual. Created by @Juzley and @paulo6. Written in Python using Pygame.

Hackshot

Hackshot » Play in browser · View source

A JavaScript coding game where you programmatically control a cannon to take down hordes of incoming enemies. Created by @buch415 using CodeMirror for the code editor and syntax highlighting. Hosted on GitHub Pages.

Code Explorer

Code Explorer » Play in browser · View source

You control a voxel programmer as he (literally) jumps through the program's code. What will happen when you reach the end? Play to quine out. Created by @michalbe using JavaScript and hosted on GitHub Pages.

Pwnterman

Pwnterman » Play in browser · View source

The README.md file suggests it's the worst game ever made. We beg to differ. This was made in two hours, and it's controlled with Vim key bindings (h, j, k, l). What's not to like? Created by @joshbressers.

Unnamed Hacking Themed Game

Unnamed Hacking Themed Game » Download (Win) · View source

Use various attacks to alter enemy defenses and objects to your advantage in this 2D action platforming game. Created by @DamienPWright using Unity.

Tech Wars

Tech Wars » Play in browser · View source · Download (Win, Mac, Linux)

The team at Vital built a Unity game that introduces a digitally enhanced virus to a tech startup. Help your infected coworkers using your Nerf darts dipped in the antidote.

Mona's Escape

Mona's Escape » Play in browser · View source

Here's a small demo where you play as Mona trying to escape a containment cell in Area 51. Watch out for the guards, though. Cat and octopus hybrids are not well equipped for combat. Created by @joshuashoemaker using Pixi.js.

Knot Fun

Knot Fun » View source · Store: Android + iOS

Don't let the name mislead you. This game about untangling knots is lotsa fun.* Created by @AStox using C#, Unity, and GIMP.

* If you love untangling your earphone cords, Christmas tree lights, kids' shoelaces, etc.

TPS Report Simulator

TPS Report Simulator » Play in browser · View source

Don't let your coworkers stop you from hacking all the terminals on your way out the door. Use the red staplers and ISP CDs to your advantage. Created by @bjshively and @spaghettioh using the JavaScript game engine, Phaser. Hosted on Heroku.

Zombie Chess

Zombie Chess » Play in browser · View source

Using open source libraries and frameworks, Gravitywell and friends created a chess game that sees iconic rock musicians fight against contemporary EDM DJs. Built with Meteor.js, Stockfish.js (AI), chessboard.js (UI), and Howler.js (SFX).

Immolation Organization

Immolation Organization » Download from the Apple App Store · Build from source

Take control of a mob that attempts to control a group of bad guys. Created by @EtherTyper, @AnimatorJoe, and @SamHollenbeck from Westlake High School's Accessible Programming Club using Swift and SpriteKit. Also available from Apple's App Store.

Hackable Mi

Hackable Mi » Play in browser · View source

Guide Mi through an array of puzzles and mazes with code. Created by @Vandise and hosted on Heroku.

Octopout

Octopout » Download (Win) · View source

In this Breakout clone, octopuses have been transformed into metal creatures. Created by @panoramix360, @guilhermekf, @fpjuni, and @stopellileo using GameMaker.

Resolve simple merge conflicts on GitHub

You can now resolve simple merge conflicts on GitHub right from your pull requests, saving you a trip to the command line and helping your team merge pull requests faster.

Demonstrating how to resolve a merge conflict

The new feature helps you resolve conflicts caused by competing line changes, like when people make different changes to the same line of the same file on different branches in your Git repository. You'll still have to resolve other, more complicated conflicts locally on the command line.
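
For the conflicts you do take to the command line, the flow hasn't changed; here's a compact sketch that manufactures a competing-line-change conflict and resolves it (branch and file names are invented for illustration):

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q
c() { git -c user.name=demo -c user.email=demo@example.com "$@"; }
echo 'original' > greeting.txt
git add greeting.txt && c commit -qm 'base'
main=$(git symbolic-ref --short HEAD)   # master or main, depending on Git version
git checkout -qb feature
echo 'hello from feature' > greeting.txt
c commit -qam 'feature change'
git checkout -q "$main"
echo 'hello from default branch' > greeting.txt
c commit -qam 'competing change'

# Both branches edited the same line: the merge stops with a conflict
git merge feature || true

# Resolve by writing the content you actually want, then stage and commit
echo 'hello from both' > greeting.txt
git add greeting.txt
c commit -qm 'merge feature, conflict resolved by hand'
```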

For more on merge conflicts and how to make them disappear on GitHub and on the command line, check out the GitHub Help documentation.

Publishing with GitHub Pages, now as easy as 1, 2, 3

Publishing a website or software documentation with GitHub Pages now requires far fewer steps — three to be exact:

  1. Create a repository (or navigate to an existing repository)
  2. Commit a Markdown file via the web interface, just like you would any other file
  3. Activate GitHub Pages via your repository's settings

And that's it — you now have a website. If you're already familiar with GitHub Pages, you may be interested to know that behind the scenes, we're now doing a few things to simplify the publishing experience and bring it more in line with what you may expect from authoring Markdown content elsewhere on GitHub:

  1. All Markdown files are now rendered by GitHub Pages, saving you from needing to add YAML front matter (the metadata at the top of the file separated by ---s) to each file.

  2. We'll use your README file as the site's index if you don't have an index.md (or index.html), not dissimilar from when you browse to a repository on GitHub.

  3. If you don't specify a theme in your site's config (or don't have a config file at all), we'll set a minimal, default theme that matches the look and feel of Markdown elsewhere on GitHub.

  4. If a given file doesn't have a layout specified, we'll assign one based on its context. For example, pages will automatically get the page layout, or the default layout, if the page layout doesn't exist.

  5. If your page doesn't have an explicit title, and the file begins with an H1, H2, or H3, we'll use that heading as the page's title, which appears in places like browser tabs.

These improvements should allow you to quickly and easily publish your first (or 100th) website with just a few clicks, or to document your software project by simply adding Markdown files to a /docs folder within your repository. Of course, you can continue to control the look and feel by opting in to additional customizations (such as overriding the default theme with your own layouts or styles).

While these changes shouldn't affect how most existing sites build, there are two potential gotchas for some more advanced Jekyll users:

  1. If your site iterates through all pages (e.g., for page in site.pages), you may find that there are now additional pages (such as the README of a vendored dependency) in that list. You can explicitly exclude these files with your config file's exclude directive.

  2. If you don't specify a page's layout or title, and expect either to be unset (e.g., if you need to serve unstyled content), you'll need to explicitly set those values as null.
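
For example, a page that should stay unstyled and untitled under the new defaults can pin both values explicitly in its front matter:

```yaml
---
layout: null   # serve the content without a layout wrapper
title: null    # don't infer a title from the first heading
---
```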

And if for any reason you don't want these features, you can disable them by adding a .nojekyll file to your site's root directory.

So that the GitHub Pages build process can be as transparent and customizable as possible, all the above features are implemented as open source Jekyll plugins, namely Jekyll Optional Front Matter, Jekyll README Index, Jekyll Default Layout, and Jekyll Titles from Headings.

Again, these changes shouldn't affect how most existing sites build (although you can safely begin to use these features), but if you have any questions, please get in touch with us.

Happy three-step publishing!

Introducing review requests

You can now request a review explicitly from collaborators, making it easier to specify who you'd like to review your pull request.

You can also see a list of the people whose review you're waiting on in the pull request page sidebar, along with the status of reviews from those who have already weighed in.

gif of requesting review from sidebar

Pending review requests will also show in the merge box. They don't affect mergeability, however, so you can still merge your pull request while you wait for a review from another collaborator.

image of merge box showing a requested review

Learn more about requesting reviews in our Help docs.

Relative links for GitHub pages

You've been able to use relative links when authoring Markdown on GitHub.com for a while. Now, those links will continue to work when published via GitHub Pages.

If you have a Markdown file in your repository at docs/page.md, and you want to link from that file to docs/another-page.md, you can do so with the following markup:

[a relative link](another-page.md)

When you view the source file on GitHub.com, the relative link will continue to work, as it has before, but now, when you publish that file using GitHub Pages, the link will be silently translated to docs/another-page.html to match the target page's published URL.

Under the hood, we're using the open source Jekyll Relative Links plugin, which is activated by default for all builds.

Relative links on GitHub Pages also take into account custom permalinks (e.g., permalink: /docs/page/) in a file's YAML front matter, as well as prepend project pages' base URL as appropriate, ensuring links continue to work in any context.

Happy (consistent) publishing!

Git 2.11 has been released

The open source Git project has just released Git 2.11.0, with features and bugfixes from over 70 contributors. Here's our look at some of the most interesting new features:

Abbreviated SHA-1 names

Git 2.11 prints longer abbreviated SHA-1 names and has better tools for dealing with ambiguous short SHA-1s.

You've probably noticed that Git object identifiers are really long strings of hex digits, like 66c22ba6fbe0724ecce3d82611ff0ec5c2b0255f. They're generated from the output of the SHA-1 hash function, which is always 160 bits, or 40 hexadecimal characters. Since the chance of any two SHA-1 names colliding is roughly the same as getting struck by lightning every year for the next eight years, it's generally not something to worry about.

You've probably also noticed that 40-digit names are inconvenient to look at, type, or even cut-and-paste. To make this easier, Git often abbreviates identifiers when it prints them (like 66c22ba), and you can feed the abbreviated names back to other git commands. Unfortunately, collisions in shorter names are much more likely. For a seven-character name, we'd expect to see collisions in a repository with only tens of thousands of objects.

To deal with this, Git checks for collisions when abbreviating object names. It starts at a relatively low number of digits (seven by default), and keeps adding digits until the result names a unique object in the repository. Likewise, when you provide an abbreviated SHA-1, Git will confirm that it unambiguously identifies a single object.

So far, so good. Git has done this for ages. What's the problem?

The issue is that repositories tend to grow over time, acquiring more and more objects. A name that's unique one day may not be the next. If you write an abbreviated SHA-1 in a bug report or commit message, it may become ambiguous as your project grows. This is exactly what happened in the Linux kernel repository; it now has over 5 million objects, meaning we'd expect collisions with names shorter than 12 hexadecimal characters. Old references like this one are now ambiguous and can't be inspected with commands like git show.

To address this, Git 2.11 ships with several improvements.

First, the minimum abbreviation length now scales with the number of objects in the repository. This isn't foolproof, as repositories do grow over time, but growing projects will quickly scale up to larger, future-proof lengths. If you use Git with even moderate-sized projects, you'll see commands like git log --oneline produce longer SHA-1 identifiers. [source]
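
You can watch the abbreviation machinery with git rev-parse; a sketch in a scratch repository (names invented):

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q
echo demo > file.txt && git add file.txt
git -c user.name=demo -c user.email=demo@example.com commit -qm 'initial'

full=$(git rev-parse HEAD)            # the full 40-character name
short=$(git rev-parse --short HEAD)   # Git's chosen safe abbreviation
echo "$full -> $short"

# Ask for at least 12 characters; Git extends further only if ambiguous
git rev-parse --short=12 HEAD
```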

That still leaves the question of what to do when you somehow do get an ambiguous short SHA-1. Git 2.11 has two features to help with that. One is that instead of simply complaining of the ambiguity, Git will print the list of candidates, along with some details of the objects. That usually gives enough information to decide which object you're interested in. [source]

SHA-1 candidate list

Of course, it's even more convenient if Git simply picks the object you wanted in the first place. A while ago, Git learned to use context to figure out which object you meant. For example, git log expects to see a commit (or a tag that points to a commit). But other commands, like git show, operate on any type of object; they have no context to guess which object you meant. You can now set the core.disambiguate config option to prefer a specific type. [source]
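
Setting the preference is a single config option; valid values are none, commit, committish, tree, treeish, and blob:

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q

# Prefer commits when a short SHA-1 could name objects of several types
git config core.disambiguate commit
git config core.disambiguate   # prints "commit"
```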

Automatically disambiguating between objects

Performance Optimizations

One of Git's goals has always been speed. While some of that comes from the overall design, there are a lot of opportunities to optimize the code itself. Almost every Git version ships with more optimizations, and 2.11 is no exception. Let's take a closer look at a few of the larger examples.

Delta Chains

Git 2.11 is faster at accessing delta chains in its object database, which should improve the performance of many common operations. To understand what's going on, we first have to know what the heck a delta chain is.

You may know that Git avoids storing files multiple times, because all data is stored in objects named after the SHA-1 of the contents. But in a version control system, we often see data that is almost identical (i.e., your files change just a little bit from version to version). Git stores these related objects as "deltas": one object is chosen as a base that is stored in full, and other objects are stored as a sequence of change instructions from that base, like "remove bytes 50-100" and "add in these new bytes at offset 50". The resulting deltas are a fraction of the size of the full object, and Git's storage ends up proportional to the size of the changes, not the size of all versions.

As files change over time, the most efficient base is often an adjacent version. If that base is itself a delta, then we may form a chain of deltas: version two is stored as a delta against version one, and then version three is stored as a delta against version two, and so on. But these chains can make it expensive to reconstruct the objects when we need them. Accessing version three in our example requires first reconstructing version two. As the chains get deeper and deeper, the cost of reconstructing intermediate versions gets larger.

For this reason, Git typically limits the depth of a given chain to 50 objects. However, when repacking using git gc --aggressive, the default is bumped to 250, with the assumption that it would make a significantly smaller pack. But that number was chosen somewhat arbitrarily, and it turns out that the ideal balance between size and CPU is actually around 50. So that's the default in Git 2.11, even for aggressive repacks. [source]

Even 50 deltas is a lot to go through to construct one object. To reduce the impact, Git keeps a cache of recently reconstructed objects. This works out well because deltas and their bases tend to be close together in history, so commands like git log which traverse history tend to need those intermediate bases again soon. That cache has an adjustable size, and has been bumped over the years as machines have gotten more RAM. But due to storing the cache in a fairly simple data structure, Git kept many fewer objects than it could, and frequently evicted entries at the wrong time.

In Git 2.11, the delta base cache has received a complete overhaul. Not only should it perform better out of the box (around 10% better on a large repository), but the improvements will scale up if you adjust the core.deltaBaseCacheLimit config option beyond its default of 96 megabytes. In one extreme case, setting it to 1 gigabyte improved the speed of a particular operation on the Linux kernel repository by 32%. [source, source]
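
Both knobs are exposed as ordinary repack flags and config settings; a sketch of pinning the 50-deep chain limit and raising the cache (the 1g value mirrors the extreme case above):

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q
echo data > f && git add f
git -c user.name=demo -c user.email=demo@example.com commit -qm 'init'

# Repack everything with delta chains capped at 50 (Git 2.11's default)
git repack -a -d -q --depth=50

# Raise the delta base cache from its 96 megabyte default
git config core.deltaBaseCacheLimit 1g
```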

Object Lookups

The delta base improvements help with accessing individual objects. But before we can access them, we have to find them. Recent versions of Git have optimized object lookups when there are multiple packfiles.

When you have a large number of objects, Git packs them together into "packfiles": single files that contain many objects along with an index for optimized lookups. A repository also accumulates packfiles as part of fetching or pushing, since Git uses them to transfer objects over the network. The number of packfiles may grow from day-to-day usage, until the next repack combines them into a single pack. Even though looking up an object in each packfile is efficient, if there are many packfiles Git has to do a linear search, checking each packfile in turn for the object.

Historically, Git has tried to reduce the cost of the linear search by caching the last pack in which an object was found and starting the next search there. This helps because most operations look up objects in order of their appearance in history, and packfiles tend to store segments of history. Looking in the same place as our last successful lookup often finds the object on the first try, and we don't have to check the other packs at all.

In Git 2.10, this "last pack" cache was replaced with a data structure to store the packs in most recently used (MRU) order. This speeds up object access, though it's only really noticeable when the number of packs gets out of hand.

In Git 2.11, this MRU strategy has been adapted to the repacking process itself, which previously did not even have a single "last found" cache. The speedups are consequently more dramatic here; repacking the Linux kernel from a 1000-pack state is over 70% faster. [source, source]
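The effect of a most-recently-used pack list is easy to model: a successful lookup moves its pack to the front, so a stream of lookups for nearby objects rarely scans the whole list. A rough Python sketch of the idea (not Git's implementation):

```python
class PackStore:
    """Toy model of searching multiple packfiles in MRU order."""

    def __init__(self, packs):
        # each pack is modeled as a (name, set_of_object_ids) pair
        self.packs = list(packs)

    def find(self, oid):
        for i, (name, objects) in enumerate(self.packs):
            if oid in objects:
                # move the hit to the front so the next lookup tries it first
                self.packs.insert(0, self.packs.pop(i))
                return name
        return None
```

With history-ordered lookups, most calls to find() hit the pack at position 0 and return immediately, which is exactly why the number of packs stops mattering so much.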

Patch IDs

Git 2.11 speeds up the computation of "patch IDs", which are used heavily by git rebase.

Patch IDs are a fingerprint of the changes made by a single commit. You can compare patch IDs to find "duplicate" commits: two changes at different points in history that make the exact same change. The rebase command uses patch IDs to find commits that have already been merged upstream.

Patch ID computation now skips merge commits and avoids rename detection, improving the runtime of the duplicate check by a factor of 50 in some cases. [source, source]
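Conceptually, a patch ID is a hash over a diff with position information stripped out, so the same change at different line offsets produces the same fingerprint. A simplified sketch of that idea in Python (the real git patch-id also normalizes whitespace and handles other details):

```python
import hashlib

def patch_id(diff_text):
    """Hash a unified diff, ignoring hunk offsets and index lines."""
    lines = []
    for line in diff_text.splitlines():
        if line.startswith("@@"):
            lines.append("@@")   # drop line-number information
        elif line.startswith("index "):
            continue             # drop blob object names
        else:
            lines.append(line)
    return hashlib.sha1("\n".join(lines).encode()).hexdigest()
```

Two commits that make the same change at different places in the file hash to the same ID, which is what lets rebase spot already-applied patches.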

Advanced filter processes

Git includes a "filter" mechanism which can be used to convert file contents to and from a local filesystem representation. This is what powers Git's line-ending conversion, but it can also execute arbitrary external programs. The Git LFS system hooks into Git by registering its own filter program.

The protocol that Git uses to communicate with the filter programs is very simple. It executes a separate filter for each file, writes the filter input, and reads back the filter output. If you have a large number of files to filter, the overhead of process startup can be significant, and it's hard for filters to share any resources (such as HTTP connections) among themselves.

Git 2.11 adds a second, slightly more complex protocol that can filter many files with a single process. This can reportedly improve checkout times with many Git LFS objects by as much as a factor of 80.
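The long-running filter protocol frames its messages with Git's pkt-line format: a four-hex-digit length prefix followed by the payload, with "0000" serving as a flush packet that ends a message. A minimal encoder/decoder sketch (illustrative only; it ignores the protocol's handshake and capability negotiation):

```python
def pkt_line(payload: bytes) -> bytes:
    """Encode one pkt-line: 4 hex digits of total length, then payload."""
    return b"%04x" % (len(payload) + 4) + payload

FLUSH = b"0000"  # a flush packet marks the end of a message

def read_pkt_line(stream):
    """Read one pkt-line from a binary stream; None means flush."""
    size = int(stream.read(4), 16)
    if size == 0:
        return None
    return stream.read(size - 4)
```

Because the filter process stays alive and speaks this framing over stdin/stdout, it pays process startup once and can keep state (such as HTTP connections) across files.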

The original protocol is still available for backwards compatibility, and the new protocol is designed to be extensible. Already there has been discussion of allowing it to operate asynchronously, so the filter can return results as they arrive. [source]

Sundries

  • In our post about Git 2.9, we mentioned some improvements to the diff algorithm to make the results easier to read (the --compaction-heuristic option). That algorithm did not become the default because there were some corner cases that it did not handle well. But after some very thorough analysis, Git 2.11 has an improved algorithm that behaves similarly but covers more cases and does not have any regressions. The new option goes under the name --indent-heuristic (and diff.indentHeuristic), and will likely become the default in a future version of Git. [source]

  • Ever wanted to see just the commits brought into a branch by a merge commit? Git now understands negative parent-number selectors, which exclude the given parent (rather than select it). It may take a minute to wrap your head around that, but it means that git log 1234abcd^-1 will show all of the commits that were merged in by 1234abcd, but none of the commits that were already on the branch. You can also use ^- (omitting the 1) as a shorthand for ^-1. [source]

  • There's now a credential helper in contrib/ that can use GNOME libsecret to store your Git passwords. [source]

  • The git diff command now understands --submodule=diff (as well as setting the diff.submodule config to diff), which will show changes to submodules as an actual patch between the two submodule states. [source]

  • git status has a new machine-readable output format that is easier to parse and contains more information. Check it out if you're interested in scripting around Git. [source]

  • Work has continued on converting some of Git's shell scripts to C programs. This can drastically improve performance on platforms where extra processes are expensive (like Windows), especially in programs that may invoke sub-programs in a loop. [source, source]
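Several of the items above lend themselves to scripting. For instance, the new machine-readable status output (git status --porcelain=v2) is straightforward to parse. The sketch below handles only ordinary changed entries (lines starting with "1"), and the sample line uses abbreviated object names for readability:

```python
def parse_porcelain_v2(line):
    """Parse one 'changed entry' line from `git status --porcelain=v2`."""
    kind, rest = line.split(" ", 1)
    if kind != "1":
        raise ValueError("only ordinary changed entries handled in this sketch")
    # XY status, submodule state, three file modes, two object names, path
    xy, sub, m_head, m_index, m_work, h_head, h_index, path = rest.split(" ", 7)
    return {
        "staged": xy[0] != ".",    # first letter: index (staged) status
        "unstaged": xy[1] != ".",  # second letter: worktree status
        "path": path,
    }
```

A status line like "1 .M N... 100644 100644 100644 0000000 0000000 README.md" then decodes to an unstaged modification of README.md.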

The whole shebang

That's just a sampling of the changes in Git 2.11, which contains over 650 commits. Check out the full release notes for the complete list.


[1] It's true. According to the National Weather Service, the odds of being struck by lightning are 1 in a million. That's about 1 in 2^20, so the odds of it happening in 8 consecutive years (starting with this year) are 1 in 2^160.

[2] It turns out to be rather complicated to compute the probability of seeing a collision, but there are approximations. With 5 million objects, there's about a 1 in 10^35 chance of a full SHA-1 collision, but the chance of a collision in 7 characters approaches 100%. The more commonly used metric is the "number of items needed to reach a 50% chance of collision", which is roughly the square root of the total number of possible items. If you're working with exponents, that's easy; you just halve the exponent. Each hex character represents 4 bits, so a 7-character name has 2^28 possibilities. That means we expect a collision around 2^14, or 16384 objects.
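The arithmetic in this footnote is easy to verify: a 7-hex-character prefix carries 28 bits, and the 50%-collision point is roughly the square root of the space:

```python
import math

bits = 7 * 4                          # 7 hex characters, 4 bits each
space = 2 ** bits                     # 2^28 possible abbreviated names
collision_point = math.isqrt(space)   # ~sqrt(N) items for a 50% chance

assert bits == 28
assert collision_point == 2 ** 14     # 16384 objects
```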