
The studENT: Publicly Available and Subscription-Based Data Sources



 By: Melissa R. Medaugh

PhD Student, University of North Carolina at Charlotte

Student Representative, AOM Entrepreneurship Division

We Need Quality Data!

I’ve often heard, “A done dissertation is a good dissertation.” As I endure a labor of love to complete my own dissertation, I’m a believer! We all want to complete our PhD programs and start successful careers as newly minted professors as soon as possible. Ah, to join the ranks of other entrepreneurship professors and earn a decent salary... Unfortunately, the process of designing and executing a dissertation can create headaches, both figurative and literal, that threaten to slow our progress. But we must overcome!

One problem that plagues entrepreneurship researchers across career levels is access to quality data. Data quality affects our ability to make valid and reliable inferences about phenomena of interest. Accessing enough entrepreneurs or entrepreneurial firms to ensure an appropriate sample is not only difficult but may also be incredibly time-consuming, especially when collecting survey data. Low response rates are especially problematic in survey research, as are convenience samples, and survey studies may also be costly. These issues are compounded in process-oriented research, which requires longitudinal data.

Indeed, entrepreneurship scholars have lamented data quality issues for decades. Drawing on his 20 years of scholarship, Deeds (2014) wrote about the field of entrepreneurship research:

We have a hard time answering… basic questions. One of the reasons for this is that it is extremely difficult to design and execute a good empirical study of entrepreneurial ventures… [This has] led to the creation and publication of a great deal of work [that is] based on small, biased, idiosyncratic samples. (p. 10)

If advanced entrepreneurship scholars experience such difficulties, doctoral students working to design and execute dissertations surely do, as well. We need data. Sourcing quality data quickly is thus a matter of great concern and critical importance.

Using Secondary or Archival Data

An alternative to primary data collection is the use of secondary or archival data. Numerous secondary data sources exist that are either publicly available or available with a university subscription. These options are extremely valuable in advancing entrepreneurship scholarship. Researchers gain access to large, often multinational and cross-industry samples at little to no cost. Larger, more diverse samples allow researchers to examine generalizability across contexts and better delineate the boundaries of theory application. Some databases include multiple observations of the same ventures or entrepreneurs over time, making process-oriented, longitudinal research more feasible.

Secondary data sources also have disadvantages that researchers must weigh. For example, researchers have no control over data collection processes, including the choice of construct measures. Relying on secondary data may therefore increase the use of less-than-ideal proxies and decrease confidence in the inferences researchers make. Collection processes may also change across waves of data collection, making historical comparisons difficult.

Despite these drawbacks, dissertations using secondary data sources can produce accurate inferences about entrepreneurship phenomena and may be designed and conducted relatively quickly, shortening the path to dissertation completion. Researchers have access to numerous publicly available databases, databases typically available with a university subscription, and other sources of openly available data. The AOM Entrepreneurship Division’s website lists several options: the Kauffman Firm Survey (KFS), Global University Entrepreneurial Spirit Students’ Survey (GUESSS), Global Entrepreneurship Monitor (GEM), Successful Transgenerational Entrepreneurship Practices Project (STEP), County Business Patterns (CBP) database, and Panel Study of Entrepreneurial Dynamics (PSED).

I also conducted a quick Google Scholar search of three top management and entrepreneurship journals to identify additional publicly available data sources that may be helpful. The list provided in the following table is certainly not exhaustive; rather, it is intended to offer a glimpse of the additional resources available to help us move our dissertations forward. Happy researching!

Additional Publicly Available or Subscription-Based Data Sources

| Data Source | Description | Published Research Exemplars |
| --- | --- | --- |
| VentureXpert (now part of Thomson ONE) | venture capital and portfolio firm data, including deals, ownership, fund profiles | Anokhin, Wincent, & Oghazi (2016); Block, De Vries, Schumann, & Sandner (2014); Cumming & Dai (2013); Dutta & Folta (2016) |
| U.S. Securities and Exchange Commission (SEC) | public companies’ annual proxy statements; corporate financial statements; crowdfunding offerings data; executive compensation; prospectus | Block (2012); Tuggle, Schnatterly, & Johnson (2010) |
| Securities and Exchange Commission of Brazil (CVM) | Brazil’s equivalent to U.S. SEC | Inoue, Lazzarini, & Musacchio (2013) |
| [source name missing] | accounting data on active and inactive U.S. public companies (1950-present) | Tuggle, Schnatterly, & Johnson (2010); Miller & Breton-Miller (2011); Fernhaber & Li (2010) |
| [source name missing] | executive compensation data (salary, bonuses, stock options) and firm financial data for S&P 1000 firms (1992-present) | Engelen, Neumann, & Schwens (2015); Martin, Gómez‐Mejía, Berrone, & Makri (2017) |
| [source name missing] | crowdfunding data | Allison, Davis, Webb, & Short (2017); Butticè, Colombo, & Wright (2017); Kuppuswamy & Bayus (2017); Skirnevskiy, Bendig, & Brettel (2017) |
| [source name missing] | UK crowdfunding data | Vismara (2016) |
| Kiva API | crowdfunding loan/microfinance data | Moss, Neubaum, & Meyskens (2015) |
| Compact Disclosure | data on publicly traded companies, compiled from SEC filings | Miller & Breton-Miller (2011) |
| [source name missing] | public company | Miller & Breton-Miller (2011) |
| Center for Research in Security Prices of the University of Chicago (CRSP) | market performance data | Miller & Breton-Miller (2011) |
| RAMS and LISA from Statistics Sweden | annual data on all Swedish firms | Bird & Wennberg (2014) |
| Corporate Library (Board Analyst) | shareholder proposals; corporate board structure; executive and director compensation; publicly traded companies | Martin, Gómez‐Mejía, Berrone, & Makri (2017) |
| Bel-First database of Bureau Van Dijk | financial data for companies in Belgium and Luxembourg; data on board of directors; size (employees); employee turnover | Molly, Laveren, & Jorissen (2012) |
| Chinese National Bureau of Statistics (NBS) | annual accounting reports filed by Chinese industrial firms | Du, Guariglia, & Newman (2015) |
| United States Federal Reserve | bank market competitiveness | Saparito, Elam, & Brush (2013) |
| Centre for Monitoring Indian Economy’s Prowess database | financial performance and marketing data for over 27,000 Indian public and private firms | Chen, Chittoor, & Vissa (2015) |
| Lexis-Nexis Academic Universe | news/media releases | Mouri, Sarkar, & Frye (2012) |
| Press/media outlets | | Jiang & Ruling (2017) |
| Social media: Twitter, Facebook, blogs, LinkedIn, YouTube | | Fischer & Reuber (2014) |
| Company web sites | e.g., mission statements | |
| U.S. Census Bureau | self-employment data; business dynamics | Lofstrom, Bates, & Parker (2014) |
| U.S. Census' County Business Patterns datasets (CBP) | cross-section of annual subnational economic data on U.S. companies since 1986, including number of establishments, payroll, and weekly employment; available by geographical area, industry, legal organizational form, and designated employee size; downloadable in .csv format | Plummer & Acs (2014) |
| U.S. Patent and Trademark Office | regional patents granted in industrialized and developing nations | Plummer & Acs (2014) |
| European Patent Office (EPO) Worldwide Patent Statistical Database (PATSTAT) | patent application data | Block, De Vries, Schumann, & Sandner (2014); Fischer & Ringler (2014) |
| World Bank Enterprise Survey | business environment data collected from owners and top managers of businesses with at least 5 employees, in numerous countries: access to finance, corruption, innovation, competition, performance | Williams, Martinez‐Perez, & Kedir (2017) |
| National Science Foundation | federal research and development expenditures | Plummer & Acs (2014) |
| Comprehensive Australian Study of Entrepreneurial Emergence (CAUSEE) | four-year longitudinal data on 625 new and 559 young firms | Crawford, Aguinis, Lichtenstein, Davidsson, & McKelvey (2015) |
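
Because the County Business Patterns files listed in the table are downloadable in .csv format, a first pass at the data can be quite short. The Python sketch below is only an illustration, not an official workflow: the inline sample rows are made up, and the column names (fipstate, fipscty, naics, est, emp) and the "------" code for all-industry totals are assumptions about the file layout that should be checked against the Census Bureau's record layout for the year you download.

```python
import csv
import io
from collections import defaultdict

# Made-up rows standing in for a downloaded CBP county file.
# Column names and the "------" all-industry code are assumptions;
# verify them against the record layout for your data year.
SAMPLE = """fipstate,fipscty,naics,est,emp
37,119,------,100,2500
37,119,54----,20,300
37,183,------,80,1900
45,045,------,40,700
"""


def establishments_by_state(csv_text):
    """Sum all-industry establishment counts for each state FIPS code."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Keep only all-industry rows so sub-industry rows are not double counted.
        if row["naics"] == "------":
            totals[row["fipstate"]] += int(row["est"])
    return dict(totals)


totals = establishments_by_state(SAMPLE)
print(totals)
```

With a real download, the same function could be fed the file's contents, for example by reading the file into a string first. Filtering on the all-industry rows matters because the files mix totals and industry breakdowns in the same column.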

