The studENT: Publicly Available and Subscription-Based Data Sources

  


 By: Melissa R. Medaugh

PhD Student, University of North Carolina at Charlotte

Student Representative, AOM Entrepreneurship Division

We Need Quality Data!

I’ve often heard, “A done dissertation is a good dissertation.” As I endure a labor of love to complete my own dissertation, I’m a believer! We all want to complete our PhD programs and start successful careers as newly minted professors as soon as possible. Ah, to join the ranks of other entrepreneurship professors and earn a decent salary... Unfortunately, the process of designing and executing a dissertation can create headaches – figuratively and literally – that threaten to slow our progress. But we must overcome!

One problem that plagues entrepreneurship researchers across career levels is access to quality data. Data quality affects our ability to make valid and reliable inferences about phenomena of interest. Accessing enough entrepreneurs or entrepreneurial firms to ensure an appropriate sample is not only difficult but may also be incredibly time-consuming, especially when collecting survey data. Low response rates in survey research are especially problematic, as are convenience samples. Survey studies may be costly as well. These issues are compounded in process-oriented research, which requires longitudinal data.

Indeed, entrepreneurship scholars have lamented data quality issues for decades. Drawing on his 20 years of scholarship, Deeds (2014) wrote about the field of entrepreneurship research:

We have a hard time answering… basic questions. One of the reasons for this is that it is extremely difficult to design and execute a good empirical study of entrepreneurial ventures… [This has] led to the creation and publication of a great deal of work [that is] based on small, biased, idiosyncratic samples. (p. 10)

If advanced entrepreneurship scholars experience such difficulties, doctoral students working to design and execute dissertations surely do, as well. We need data. Sourcing quality data quickly is thus a matter of great concern and critical importance.


Using Secondary or Archival Data

An alternative to primary data collection is the use of secondary or archival data. Numerous secondary data sources exist that are either publicly available or available with a university subscription. These options are extremely valuable in advancing entrepreneurship scholarship. Researchers gain access to large, often multinational and cross-industry samples at little to no cost. Larger, more diverse samples allow researchers to examine generalizability across contexts and better delineate boundaries in theory application. Some databases include multiple observations of the same ventures or entrepreneurs over time, making process-oriented, longitudinal research more feasible.

Secondary data sources also have disadvantages that researchers must weigh. For example, researchers have no control over data collection processes, including the choice of construct measures. Relying on secondary data sources may increase the use of less-than-ideal proxies and decrease confidence in the inferences researchers make. Data collection processes may also change across waves, making historical comparisons difficult.

Despite these drawbacks, dissertations using secondary data sources can produce accurate inferences about entrepreneurship phenomena and may be designed and conducted relatively quickly, shortening the path to dissertation completion. Researchers have access to numerous publicly available databases, databases typically available with a university subscription, and other sources of openly available data. The AOM Entrepreneurship Division’s website lists several options: the Kauffman Firm Survey (KFS), Global University Entrepreneurial Spirit Students’ Survey (GUESSS), Global Entrepreneurship Monitor (GEM), Successful Transgenerational Entrepreneurship Practices Project (STEP), County Business Patterns (CBP) database, and Panel Study on Entrepreneurial Dynamics (PSED).

I also conducted a quick Google Scholar search of three top management and entrepreneurship journals to identify additional publicly available data sources that may be helpful. The list provided in the following table is certainly not exhaustive; rather, it is intended to offer an overview of more resources available to help us move our dissertations forward. Happy researching!


Additional Publicly Available or Subscription-Based Data Sources

| Data Source | Description | Published Research Exemplars |
| --- | --- | --- |
| VentureXpert (now part of Thomson ONE) | venture capital and portfolio firm data, including deals, ownership, fund profiles | Anokhin, Wincent, & Oghazi (2016); Block, De Vries, Schumann, & Sandner (2014); Cumming & Dai (2013); Dutta & Folta (2016) |
| U.S. Securities and Exchange Commission (SEC) | public companies’ annual proxy statements; corporate financial statements; crowdfunding offerings data; executive compensation; prospectuses | Block (2012); Tuggle, Schnatterly, & Johnson (2010) |
| Securities and Exchange Commission of Brazil (CVM) | Brazil’s equivalent to the U.S. SEC | Inoue, Lazzarini, & Musacchio (2013) |
| Compustat | accounting data on active and inactive U.S. public companies (1950-present) | Tuggle, Schnatterly, & Johnson (2010); Miller & Breton-Miller (2011); Fernhaber & Li (2010) |
| ExecuComp | executive compensation data (salary, bonuses, stock options) and firm financial data for S&P 1000 firms (1992-present) | Engelen, Neumann, & Schwens (2015); Martin, Gómez‐Mejía, Berrone, & Makri (2017) |
| Kickstarter | crowdfunding data | Allison, Davis, Webb, & Short (2017); Butticè, Colombo, & Wright (2017); Kuppuswamy & Bayus (2017); Skirnevskiy, Bendig, & Brettel (2017) |
| Crowdcube | UK crowdfunding data | Vismara (2016) |
| Kiva API | crowdfunding loan/microfinance data | Moss, Neubaum, & Meyskens (2015) |
| Compact Disclosure | data on publicly traded companies, compiled from SEC filings | Miller & Breton-Miller (2011) |
| Hoover’s | public company data | Miller & Breton-Miller (2011) |
| Center for Research in Security Prices of the University of Chicago (CRSP) | market performance data | Miller & Breton-Miller (2011) |
| RAMS and LISA from Statistics Sweden | annual data on all Swedish firms | Bird & Wennberg (2014) |
| Corporate Library (Board Analyst) | shareholder proposals; corporate board structure; executive and director compensation; publicly traded companies | Martin, Gómez‐Mejía, Berrone, & Makri (2017) |
| Bel-First database of Bureau van Dijk | financial data for companies in Belgium and Luxembourg; data on board of directors; size (employees); employee turnover | Molly, Laveren, & Jorissen (2012) |
| Chinese National Bureau of Statistics (NBS) | annual accounting reports filed by Chinese industrial firms | Du, Guariglia, & Newman (2015) |
| United States Federal Reserve | bank market competitiveness | Saparito, Elam, & Brush (2013) |
| Centre for Monitoring Indian Economy’s Prowess database | financial performance and marketing data for over 27,000 Indian public and private firms | Chen, Chittoor, & Vissa (2015) |
| Lexis-Nexis Academic Universe | news/media releases | Mouri, Sarkar, & Frye (2012) |
| Press/media outlets | | Jiang & Ruling (2017) |
| Social media: Twitter, Facebook, blogs, LinkedIn, YouTube | | Fischer & Reuber (2014) |
| Company web sites | e.g., mission statements | |
| U.S. Census Bureau | self-employment data; business dynamics | Lofstrom, Bates, & Parker (2014) |
| U.S. Census' County Business Patterns datasets (CBP) | cross-section of annual subnational economic data on U.S. companies since 1986, including number of establishments, payroll, and weekly employment; available by geographical area, industry, legal organizational form, and designated employee size; downloadable in .csv format | Plummer & Acs (2014) |
| U.S. Patent and Trademark Office | regional patents granted in industrialized and developing nations | Plummer & Acs (2014) |
| European Patent Office (EPO) Worldwide Patent Statistical Database (PATSTAT) | patent application data | Block, De Vries, Schumann, & Sandner (2014); Fischer & Ringler (2014) |
| World Bank Enterprise Survey | business environment data collected from owners and top managers of businesses with at least 5 employees, in numerous countries: access to finance, corruption, innovation, competition, performance | Williams, Martinez‐Perez, & Kedir (2017) |
| National Science Foundation | federal research and development expenditures | Plummer & Acs (2014) |
| Comprehensive Australian Study of Entrepreneurial Emergence (CAUSEE) | four-year longitudinal data on 625 new and 559 young firms | Crawford, Aguinis, Lichtenstein, Davidsson, & McKelvey (2015) |

 

