On Tuesday, a lot of the conversation in my neck of the internet was about the arraignment of activist & open access advocate Aaron Swartz on federal charges of wire fraud and unauthorized network use. Most of the discussion was among the geeklaw aficionados, and I’ve been kind of surprised that the general library and higher ed crowds haven’t seemed to be following it that closely. The networks most deeply involved in the case are those of JSTOR, the not-for-profit service that hosts & archives hundreds and hundreds of scholarly journals.
And aside from the geeklaw-y librarians I know, what discussion I have seen from academics (and from a lot of nonacademic commentators) has been saying things like “they’re bringing criminal charges against a researcher for downloading too many articles???” and “He was a legit user of JSTOR, this is ridiculous!” I do agree that the prospect of jail time for Swartz’s activities (especially when JSTOR itself had apparently considered the matter settled) seems like a massive overreaction on the part of the prosecutors. However, the charges in the indictment, and Swartz’s alleged criminal activities, are NOT “downloading too many articles.”
Let’s get some stuff straight.
1. There are NO copyright charges involved.
United States copyright law is codified at Title 17 of the U.S. Code, and criminal copyright infringement is defined at 17 U.S.C. § 506. All of the charges in this case fall under Title 18 of the U.S. Code – the federal criminal code – and the specific charges involve wire fraud (18 U.S.C. § 1343) and computer fraud (18 U.S.C. § 1030).
Whether these kinds of actions should be grounds for criminal prosecution, especially when the organizations & institutions that run the networks in question have chosen not to bring civil suit, is a question around which there’s pretty serious debate. Similar charges were brought against Lori Drew, whose harassment and bullying of a young teen on MySpace was a major contributing factor to the teen’s eventual suicide. Prosecutors argued that Drew’s activities violated the MySpace terms of service, and that that alone constituted “unauthorized access” to MySpace, and thus was grounds for prosecution under the Computer Fraud and Abuse Act (the same law that is the basis of several of the charges against Swartz.) Ultimately, a federal judge overturned the jury’s guilty verdict, questioning the wisdom of allowing website terms of service – which can be defined at the whim of the site owner – to form the basis of criminal charges. Many legal scholars and commentators (full disclosure: I edited that last link) agreed that this was the correct legal outcome (although almost all expressed abhorrence at Drew’s actual activities.) This has not stopped subsequent prosecutions on similar theories – where violations of terms of service are used as the basis of computer fraud “unauthorized access” charges.
2. Campus subscriptions don’t actually confer unlimited access to databases!
Swartz’s initial access to the MIT network was totally legitimate – MIT offered guest users access to its network and subscription library resources for up to 14 days. This is pretty generous – a lot of campuses offer much more limited network and subscription resource access to guests, partly because access for more potential users usually costs the campus more. But even his initial access, if the allegations are true, involved running an automated program to query the JSTOR databases and scrape content out of them.
2a. Some of the limitations on use of subscription resources are kinda wacky, but the main one Swartz allegedly violated is pretty straightforward.
Most subscription resources, JSTOR included, prohibit even users with legitimate access from downloading whole issues of individual journals. This is a little wacky, because it’s pretty common for a journal to devote an entire issue to one topic, so it might be really relevant to someone’s research to download a whole issue, or even several whole issues. JSTOR’s policy is more generous than most, explicitly recognizing that sometimes “the entire contents of a journal issue[…] [may be relevant] to a particular research purpose” and allowing larger access under those circumstances.
Even more wackily, most library subscription resources prohibit anything other than “personal use”. As Barbara Fister ably outlined recently, defining what “personal use” is in the process of scholarship is a pretty tricky issue, and there are a lot of activities that a lot of faculty members regularly engage in that might violate these kinds of limitations on use. The height of wacky restrictions that subscription resources impose on legitimate users is probably the Harvard Business Review’s prohibition on linking to articles even from within password-protected campus networks.
But the usage limitation that Swartz is accused of violating is the one against systematic downloading of content using automated software. I just don’t see this limitation as all that wacky. An automated script querying and downloading from a server can impose a really heavy load on that server – spiking use much higher than even a large group of human users. This limitation seems to me like a pretty reasonable tool for service providers to manage and predict their network loads.
2b. “Scholarship shouldn’t be locked up in these ivory-tower, commercialized, locked-down, and restricted databases in the first place!”
Actually, I agree with you, at least in principle. (And, um, JSTOR’s non-profit…) But there are many, many structural factors that contribute to the ongoing set of problems of access to the products of scholarship. We are working on it (oh, wow, are we working!), but the cultural change, it goes slowly. And the fact remains that academic authors have for years been transferring their copyrights to publishers (non-profit and, increasingly, commercial) without much thought. So right now, the copyrights in these articles do, mostly, live in the hands of the publishers. And that’s most directly the result of the authors’ decisions (or lack of awareness that there were decisions to be made). I seriously question whether large-scale, knowing copyright infringement is a completely necessary response, or whether, even as civil disobedience, such activities are going to accomplish much change.
I also question the allegation that Swartz did all this stuff with the intent to upload all the articles to filesharing sites. He may just have been doing it to see if he could. (Weirdly, he used the guest access at MIT even though he had full access to JSTOR at the time through his fellowship at Harvard.) I don’t know all that much about the guy, but he sounds pretty smart, and I’m fairly sure he would recognize the quite different legal and ethical implications of redistributing works under copyright versus redistributing limited-access public domain materials, as he did in some earlier projects. He really did work on an article doing textual analysis on a large body of scholarly articles, though it’s unclear whether those articles were obtained from the JSTOR scraping. (Although even if his intent really was just to analyze the articles, it’s also unclear whether making whole copies of massive numbers of texts for scholarly analysis without permission, even via unquestionably authorized access, is a fair use under copyright law.)
3. “It’s like he’s being prosecuted for checking out too many books!”
The copying he allegedly did is very little like checking out a large quantity of books (which, incidentally, is totally legit under copyright’s “first sale” doctrine (17 U.S.C. § 109), but may be limited by library policies.) It’s much more like photocopying large quantities of journal articles. And, as I said above, the copying is only tangentially related to the charges (in that that is how he allegedly violated the terms of service of JSTOR.)
Maybe a better comparison story would be this: someone goes to an open-to-the-public library, and starts taking lots of journals off the shelves and photocopying them. The library staff asks this Someone to stop, because he’s making it hard for the other patrons to use the journals, and because he’s copying in such volume that they have some copyright concerns (yeah, yeah, I don’t want libraries to be the copyright police. But the 17 U.S.C. § 108 limitations on libraries’ liability for patron copying don’t really protect libraries from known large-scale questionable use of their resources, and we’re talking some pretty darn large-scale photocopying.) Someone persists in the copying, so much so that the journals are all unusably out of order (JSTOR’s servers allegedly overloaded), the copiers break (MIT’s network allegedly got stressed), and the journal distributors even refuse to deliver new issues until the library does more to stop this Someone’s copying (JSTOR turned off service to the whole MIT campus for multiple days, eventually.) Nevertheless, this Someone still wants to copy, so he breaks into the library at night to continue going about his business. And no one sues him for copyright infringement, and the distributors and the library let things drop when he finally knocks it off. But the prosecutors step in and bring charges against him for messing up the journals, breaking the copiers, and breaking into the library.
In that story, it’s still kinda questionable whether criminal charges (and certainly, whether a potential 35 years of imprisonment) are appropriate. But I think it’s a lot clearer that Someone was doing some pretty questionable things. And maybe thinking about it that way, we can move past “Jail? For downloading too many articles???” and start figuring out what we as individuals and as an international community of scholars can do to open things up so similarly problematic access situations are unimaginable 50 years from now.