I chose the “Listing of Active Businesses” dataset. These businesses are currently registered with the Office of Finance, and the records are organized primarily by location, which includes zip code, street address, mailing address, and geographical coordinates; there are 497,000 rows and 16 columns in total. The other fields include location start date, council district, NAICS, and NAICS description. NAICS, which stands for North American Industry Classification System, is a self-assigned classification that provides specific insight into the type of business establishment.
Those who will find the data most useful and illuminating are those who can draw meaningful insights from the location, start date, and industry categories of each business. The dataset seems to focus primarily on location, since the user can filter on it very specifically; for example, someone who is unsure whether a particular business’s mailing address matches its actual location can compare the two. A broader phenomenon the data can describe is which types of businesses are most popular in Los Angeles (or in nearby cities like Beverly Hills and Torrance); these seem to be real estate, food, and educational services. One can narrow the query even further and, using the location start date filter, discover which industry or type of business was most frequently started during a specific decade.
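To make the decade query concrete, here is a minimal sketch in Python of how one might tally the most common industry per decade. The field names (`naics_description`, `location_start_date`), the MM/DD/YYYY date format, and the sample rows are all assumptions made for illustration; the dataset's actual column headers and values may differ.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Made-up sample rows for illustration only; field names and the date
# format are assumptions, not the dataset's confirmed headers.
rows = [
    {"naics_description": "Lessors of real estate", "location_start_date": "03/15/1994"},
    {"naics_description": "Lessors of real estate", "location_start_date": "07/01/1998"},
    {"naics_description": "Full-service restaurants", "location_start_date": "11/20/1996"},
    {"naics_description": "Full-service restaurants", "location_start_date": "02/10/2003"},
    {"naics_description": "Full-service restaurants", "location_start_date": "09/05/2007"},
    {"naics_description": "Educational services", "location_start_date": "06/30/2005"},
    {"naics_description": "Lessors of real estate", "location_start_date": ""},  # missing date
]


def top_industry_by_decade(rows):
    """Return {decade: (industry, count)} for the most frequent industry per decade."""
    counts = defaultdict(Counter)
    for row in rows:
        raw_date = row.get("location_start_date", "").strip()
        if not raw_date:
            continue  # skip rows with no start date
        year = datetime.strptime(raw_date, "%m/%d/%Y").year
        decade = year // 10 * 10  # e.g. 1994 -> 1990
        counts[decade][row["naics_description"]] += 1
    # Keep only the single most common industry for each decade.
    return {decade: c.most_common(1)[0] for decade, c in sorted(counts.items())}


for decade, (industry, count) in top_industry_by_decade(rows).items():
    print(f"{decade}s: {industry} ({count} businesses)")
```

With the full export, the same function could be fed rows read through `csv.DictReader`, after renaming the keys to match the real headers.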
Those who want to start a business would find this dataset particularly useful; I could imagine someone who wants to start a financial services company on Olympic Boulevard doing competitor research to see whether any other financial services companies are nearby before setting up shop. One issue, however, is that the NAICS description alone is too one-dimensional; it does not capture a well-rounded picture of an entire business operation, such as its customer segments, client profiles, and branding. If there are other financial services companies nearby and the prospective entrepreneur were limited to that specific location, it would be valuable to know how best to differentiate the new business.
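The competitor search described above could be sketched roughly as follows. The field names (`business_name`, `street_address`, `naics_description`) and every sample business are invented for illustration, and simple keyword matching is only one assumed way to approximate "nearby"; a real search would more likely use the geographical coordinates.

```python
# Invented sample rows; names, addresses, and field names are hypothetical.
businesses = [
    {"business_name": "Alpha Wealth Advisors", "street_address": "1200 W OLYMPIC BLVD",
     "naics_description": "Financial services"},
    {"business_name": "Olympic Noodle House", "street_address": "1450 W OLYMPIC BLVD",
     "naics_description": "Full-service restaurants"},
    {"business_name": "Beta Capital Group", "street_address": "300 S GRAND AVE",
     "naics_description": "Financial services"},
    {"business_name": "Gamma Lending", "street_address": "2100 W Olympic Blvd",
     "naics_description": "Other financial services"},
]


def find_competitors(rows, street_keyword, industry_keyword):
    """Return names of businesses whose address and industry contain the keywords."""
    street_keyword = street_keyword.lower()
    industry_keyword = industry_keyword.lower()
    return [
        row["business_name"]
        for row in rows
        # Case-insensitive substring match on both fields.
        if street_keyword in row["street_address"].lower()
        and industry_keyword in row["naics_description"].lower()
    ]


print(find_competitors(businesses, "olympic blvd", "financial"))
```

As the paragraph notes, a list like this shows only that competitors exist on the street; it says nothing about their customers or branding.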
Factors left out include financials and profitability, the number of periods of inactivity, business entity type (such as LLC), and personal information about the founders. If I were to start over with data collection to describe a totally different phenomenon, one catering to prospective employees seeking jobs at these businesses, the data I would want to collect includes number of employees, financial statements, number of jobs available, and employee ratings. To give the data a more humanistic dimension, this set might also include related media such as photos and videos of the establishment as well as newspaper articles (and, to add a historical dimension, these media pieces would be collected over time).