the realm of Amber Brown

50,000,000 Twisted Downloads Can't Be Wrong

2019.06.09

🎵 Currently listening to: Careful What You Pack – They Might Be Giants (The Else) 🎵

You know how it is – 50,000,000 of anything can’t be wrong. But, when looking at Twisted’s downloads, there is indeed something wrong – even if there’s been more than 50 million.

I was told during the 2019 Python Language Summit that I should drop support for Python 2.

Van Rossum argued instead that if the Twisted team wants the ecosystem to evolve, they should stop supporting older Python versions and force users to upgrade. Brown acknowledged this point, but said half of Twisted users are still on Python 2 and it is difficult to abandon them.

I think it’s worth looking a little closer at why it is difficult to abandon them.


A Quick Note on Statistics, & Other Lies

I say “half of Twisted users”. This might be overstating or understating the real numbers, as we’ll shortly see. I’m using PyPI downloads as a metric of adoption, as Twisted does not call home and I have no way of knowing the relative penetration of things such as Debian or Red Hat packagings of Twisted.

This is unfortunate, but I can make a number of assumptions about the users of distribution packages:

  • They are possibly using an application built on Twisted, and so don’t count in my sense of “user”, as I am counting developers (users of the library) and not end-users.
  • They are almost certainly using older versions with inadequate Python 3 support, and so we can assume that they are more often Python 2 users than not.
  • They are the users that will only see the effects of the present day in several years, and so adoption will of course lag behind.

So, we can start off with the assumption that any Twisted users using distro-packaged Twisted are more likely to be using Python 2 than not, and weight our conclusions with this estimate accordingly.


Analysing the Downloads

PyPI publishes download statistics. If we look at a reasonable sample set (the downloads of the past 30 days), we get some interesting numbers:

Python Version    Downloads (30d)
2.7                        792877
3.4                         15395
3.5                         52376
3.6                        208767
3.7                        128079
3.8                           727

We can clearly see that Python 2.7 is Twisted’s most popular platform, followed by Python 3.6 and Python 3.7. Python 3.4 (which the current Twisted release no longer supports) and Python 3.5 are falling into disuse, but together these old platforms still account for about half as many downloads as the current stable version of Python, 3.7. Python 3.8 (currently in beta) is virtually a rounding error (as expected).

Python 3, all versions combined, has just over half the downloads of Python 2.7 alone.
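
As a rough check on that arithmetic, the shares can be recomputed from the 30-day counts in the table above. This is only a sketch: the variable names are illustrative, and the numbers are copied straight from the table rather than fetched from PyPI.

```python
# 30-day PyPI download counts for Twisted, copied from the table above.
downloads = {
    "2.7": 792877,
    "3.4": 15395,
    "3.5": 52376,
    "3.6": 208767,
    "3.7": 128079,
    "3.8": 727,
}

total = sum(downloads.values())
py3_total = sum(n for version, n in downloads.items() if version.startswith("3."))

# Share of all downloads per Python version, as a percentage.
shares = {version: 100 * n / total for version, n in downloads.items()}

for version, share in shares.items():
    print(f"Python {version}: {share:5.2f}%")

# Python 3 downloads (all minor versions) relative to Python 2.7 downloads.
py3_to_py2 = py3_total / downloads["2.7"]
print(f"Python 3 combined vs 2.7: {py3_to_py2:.2f}")

# The two old platforms (3.4 + 3.5) relative to 3.7.
old_py3_to_37 = (downloads["3.4"] + downloads["3.5"]) / downloads["3.7"]
print(f"Python 3.4 + 3.5 vs 3.7: {old_py3_to_37:.2f}")
```

The ratios come out at roughly one half in both cases, which is where the “half” figures in the text come from.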

These adoption numbers seem fairly dire, but they don’t paint the entire picture. What if we looked at specific versions?

Libraries.io says that Twisted’s top two pinned versions are 17.9.0 and 13.2.0 (my first release as RM!). If we inspect the statistics for those two versions as well as the present one (19.2.1 – although it hasn’t been out for 30 days, so I’ll look at 19.2.0 as well), there are some interesting numbers:

Python Version    13.2.0    17.9.0    19.2.x     Total
2.7               97.46%    37.73%    37.23%    66.17%
3.4                0.07%     0.66%     0.92%     1.28%
3.5                0.00%    27.91%     6.77%     4.37%
3.6                1.82%    30.91%    32.31%    17.42%
3.7                0.65%     2.79%    22.62%    10.69%
3.8                0.00%     0.00%     0.15%     0.06%
% Share            0.36%     2.18%    40.50%   100.00%

Note: The % Share is the share of downloads in the past 30 days that version of Twisted accounted for.
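
To put those percentages in absolute terms, the sketch below converts them into approximate download counts. It assumes that the 30-day total behind this table is the sum of the counts in the earlier table (1,198,221), which seems reasonable because the Total column matches those counts, but that is an inference on my part. The function name is illustrative.

```python
# Assumption: total 30-day downloads is the sum of the per-Python-version
# counts from the earlier table.
TOTAL_DOWNLOADS = 1198221

# "% Share" row: share of all downloads per Twisted version (percent).
version_share = {"13.2.0": 0.36, "17.9.0": 2.18, "19.2.x": 40.50}

# "2.7" row: share of each Twisted version's downloads made from Python 2.7 (percent).
py27_share = {"13.2.0": 97.46, "17.9.0": 37.73, "19.2.x": 37.23}


def estimate_py27_downloads(twisted_version):
    """Approximate Python 2.7 downloads of one Twisted version in 30 days."""
    version_downloads = TOTAL_DOWNLOADS * version_share[twisted_version] / 100
    return version_downloads * py27_share[twisted_version] / 100


for twisted_version in version_share:
    estimate = estimate_py27_downloads(twisted_version)
    print(f"Twisted {twisted_version}: ~{estimate:,.0f} downloads from Python 2.7")
```

Under that assumption, the current release accounts for roughly 180,000 of the Python 2.7 downloads, so most Python 2.7 downloads are of older Twisted versions.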

Installations of Twisted in the past 30 days indicate that those who use new Twisted versions also use new Python versions. A significant chunk of current Twisted users still use Python 2.7, though.

Although it is improving (Python 2.7 downloads were some ~47% of Twisted 18.9 when that version was current), Python 2 users are still a substantial chunk of Twisted’s install base. It can be argued that Twisted need not worry about the 66% of total downloads, as we should actually be looking at 19.2’s 37% – but 37% is still a concerning number.


Twisted Is The Long Tail

Twisted is still not fully ported to Python 3. This is for several reasons – developer time, developer interest, and feasibility.

Developer time is not free. Twisted is currently nearly entirely a volunteer project, with paid time coming in the handful of hours that people can justify their employers spending on the codebase. For a lot of employers, it’s difficult to make a case for more – Twisted, on the whole, just works. So does Python 2.7. It may not be pretty or have syntax niceties, but combined with an established Twisted codebase (that has already paid the cost of figuring out how to do it reliably in Python 2), it also just works. Up until async/await, there was not any ‘killer feature’ to even begin enticing Twisted users to Python 3. It only meant more work (since even Unicode-clean codebases can take many, many person-hours to port), and a slower runtime.

Even today, the only benefits that a Python 2+3 compatible Twisted application (like Synapse) gets from Python 3 are Flexible String Representation (giving significant memory savings when handling Unicode) and Python 3’s universal use of iterators (which mostly means the standard library being lazier, not your own code). It’s nice, of course, but incremental improvements are not how you court upgrades.

Twisted itself also took a long time to port, producing a knock-on effect in the library’s ecosystem. 2to3 never worked for Twisted, and it wasn’t until Python 3.3 (2012 – four years after 3.0) that the development of Twisted as a 2+3 compatible codebase could reasonably begin. Even with my efforts (beginning in 2013), it wasn’t until 2016 or 2017 that Twisted could be said to be ported enough to reasonably use for many applications. Some parts (like IRC support) have not yet been ported, meaning software using it is set back even further in the porting process.


What to do?

This leaves Twisted in an interesting situation.

Our user base on Python 2.7 is likely not moving to Python 3 any time soon. The Python 2 EOL date is only symbolic as long as distributions ship Python 2.7, and for many applications it might simply not be worth it to put the effort in to port something that already runs, and runs well – even if the porting is theoretically easy. This means Twisted needs to deal with the long tail of Python 3 adoption.

There is a grain of truth to van Rossum’s comment. The only way these users will upgrade is if they are forced. My concern is that they will not upgrade to Python 3; they will ‘upgrade’ to Go, Rust, or some other language. My personal experience over the past couple of years is that this is more common than one would think. Many companies would rather invest in the possibilities of a fresh rewrite, in what could be seen as a better language, than pay the costs of migrating their existing software. As much as we know that rewrites are nearly always pointless folly, they still happen. I would rather not force this ultimatum on my users, as they might end up not adding to the ecosystem after all, and instead feel burned by having decided to use Python to begin with.

As such, I feel Twisted needs to adopt a harm-reduction method of migration from Python 2.

I posted a proposal for such a method to the Twisted mailing list. It assumes that Python 2 users are the long tail in other areas too, and are unlikely to adopt new features. If so, declaring a final version for 2.7 which can be separately patched may be an option.

I know that several in the community (e.g. Glyph :) ) are opposed to an LTS of Twisted, which this would presumably be. But, if we cannot drop Python 2.7 support wholesale, the maintenance burden of an “LTS” may well end up lower than preserving Python 2.7 compatibility in current versions of Twisted.


But, in the end…

As an open source maintainer, it’s hard to say I don’t feel some sort of duty to those that use my software. That sense of duty makes it hard to decide to leave users behind (even if they really should be upgrading), especially users that potentially don’t have a viable upgrade option. Maybe it is misplaced, and the Twisted maintainers should stop going to the effort to maintain software that is, by this point, old enough to begin learning to program in Python.

None of those personal feelings matter, anyway. I’m just the release manager, not the entirety of Twisted – and we have collectively decided that it’s not time to let Python 2.7 fall by the wayside yet. No matter how we got here, the cost to our users is still simply too great.