NBER US Patent Citation Database

Detailed information on almost 3 million U.S. patents granted between January 1963 and December 1999, all citations made to these patents between 1975 and 1999 (over 16 million), and a reasonably broad match of patents to Compustat (the data set of all firms traded in the U.S. stock market).

Update: The NBER is working on a major NSF-funded update and extension of these data. A new release of these files, bringing existing data up to date through December 2004, is anticipated for 2010 or 2011. A variety of additional fields and indexes will also be provided. These are anticipated to include "link-out" tables connecting patent numbers to geographic entities (e.g., SMSAs), and a codification of inventor names.

Data

These data are described in detail in

Hall, B. H., A. B. Jaffe, and M. Trajtenberg (2001). "The NBER Patent Citation Data File: Lessons, Insights and Methodological Tools." NBER Working Paper 8498.

ALL USERS OF THESE DATA SHOULD READ THIS PAPER, AND SHOULD CITE IT AS THE SOURCE OF THE DATA

...

The data are freely available below in two compressed (".zip") formats: SAS transport (.tpt) files and ASCII comma-separated values (.csv) files. The program read_tpt.sas can be used to convert the .tpt files to native SAS data sets. The SAS .tpt files can also be transferred to other formats using software such as Stat/Transfer or DBMS/Copy.

In the ASCII CSV files, lines are terminated by the newline character "\n", all values are separated by commas, and character values are enclosed in double quotes.

The compression ratio for the compressed files is about 75%. The ".zip" files can be uncompressed with WinZip or PKUNZIP. To check your ability to uncompress these files, download the small file compress.zip.
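As a rough illustration of the CSV layout described above (comma-separated values, character values in double quotes, "\n" line endings, shipped inside a ".zip" archive), the following Python sketch reads such a file without unpacking it to disk. The helper names `read_zipped_csv` and `make_sample_zip`, the member name `sample.csv`, the column names, and the sample values are all assumptions made for this example; they are not taken from the NBER files. Consult the documentation files for the actual variable names.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_source, member=None):
    """Yield one dict per data row from a CSV file stored inside a .zip archive.

    zip_source may be a path or a file-like object. If member is None,
    the first file listed in the archive is read.
    """
    with zipfile.ZipFile(zip_source) as archive:
        name = member or archive.namelist()[0]
        with archive.open(name) as raw:
            # The files are described as ASCII; newline="" lets the csv
            # module handle line endings itself.
            text = io.TextIOWrapper(raw, encoding="ascii", newline="")
            yield from csv.DictReader(text)


def make_sample_zip():
    """Build a tiny in-memory archive (made-up data) that mimics the layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Hypothetical header and rows: quoted character values, comma
        # separators, "\n" line terminators.
        archive.writestr(
            "sample.csv",
            '"CITING","CITED"\n3858241,956203\n3858241,1324234\n',
        )
    buffer.seek(0)
    return buffer


rows = list(read_zipped_csv(make_sample_zip()))
for row in rows:
    print(row)
```

To try this on a real download, pass the path of the ".zip" file in place of `make_sample_zip()`. The csv module returns every value as a string, so numeric fields need explicit conversion, and because the function is a generator the larger files can be processed row by row rather than loaded whole.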

Datasets

| Description | Documentation | SAS .tpt (zipped) | ASCII CSV (zipped) |
|---|---|---|---|
| Overview | overview.txt | -- | -- |
| Pairwise citations data | Cite75_99.txt | Cite75_99.zip (68 Mb) | acite75_99.zip (82 Mb) |
| Patent data, including constructed variables | pat63_99.txt | pat63_99.zip (90 Mb) | apat63_99.zip (56 Mb) |
| Assignee names | coname.txt | coname.zip (2 Mb) | aconame.zip (2 Mb) |
| Match to CUSIP numbers | match.txt | match.zip (130 Kb) | amatch.zip (98 Kb) |
| Individual inventor records | inventor.txt | inventor.zip (98 Mb) | ainventor.zip (82 Mb) |

Class codes with corresponding class names classes.txt --

Country codes with corresponding country names countries.txt

Class, technological category, and technological subcategory crosswalk class_match.txt

Technological category and subcategory labels subcategory.txt -- subcategory.csv

SAS program to convert .tpt files to native SAS format

read_tpt.sas

U.S. Patent Classification (USPC) System and the Standard Industrial Code (SIC) System

Openness: OPEN

  • License: OK. (PD)
  • Access: yes. Good docs and open formats.
    • bulk: yes.


Additional Info

Source: http://www.nber.org/patents/
Author: Hall, B. H., A. B. Jaffe, and M. Trajtenberg
Version: 2001
Last Updated: October 10, 2013, 23:16 (UTC)
Created: April 12, 2007, 16:16 (UTC)