
Data Work Makes Data Work: Laying the Foundation for Analysis

May 12, 2022

Agencies have a lot of data. Analyzing this data can provide valuable information to the agency, but often the agency does not have staff that are experienced in doing data analysis. In this article, we’ll cover how your agency can tackle data analysis with the right partner and mindset.

The very first step of the analysis is doing the data work. Inexperienced analysts often shortcut this step or skip it altogether. There are several reasons for this:

  • Not knowing how important the data work is for a proper analysis.
  • Assuming the data is sound.
  • Being too eager to do the analysis work.
  • Finding the work boring and tedious, as most people do.

It sounds like a cliché, but an analysis normally follows the 80/20 rule: 80% of the time is spent on the data work and 20% on the actual analysis. Since most find the data work tedious, this 80/20 breakdown turns many analysts off. The biggest mistake an analyst can make is jumping into data analysis without first checking and correcting the data.

The above ratio can be reduced if the agency takes the time to do an overall data clean-up project and then does regular data integrity work right after new data is brought into its database. For Department of Transportation (DOT) agencies that have bid lettings once or twice a month, this is very manageable; however, for other agencies, like a city that has new data entering its database every day, the best solution would be to perform regular data clean-up sessions (e.g., once a month) to minimize the amount of data to be evaluated and cleaned up. If this advice is followed, an analyst can proceed to analysis work much more quickly. Unfortunately, very few agencies have the foresight to allocate time and resources for this important task.

Since our founding in 1977, a large part of Infotech’s business has been with DOT agencies. Many DOT agencies employ Infotech to do data clean-up work or data analysis for the agency. Even for an agency that may be doing regular data integrity work, Infotech always checks the data for soundness, as all analysts should (never assume the data is good and sound). For agencies that regularly ensure their data is correct, this check normally goes rather quickly; however, all too often, agencies don’t perform data integrity work, and a large part of Infotech’s time is spent plowing through this data and making corrections (the 80/20 rule).

Different types of analysis require different data work. For example, an analysis of asphalt paving contracts requires sound data on project locations and facility (asphalt plant) locations, among other things. An analysis of highway guardrail contracts, however, is much less dependent on location. The reason for this difference is that a commodity like asphalt can only be hauled a certain distance and for a certain time. Guardrail work is highly mobile and not tied to plant and project locations.

Since a lot of Infotech’s market analysis for DOT agencies is on asphalt paving (the largest commodity in an agency’s program), the list below summarizes the major aspects of data work that need to be performed:

  • Classification of Items into Functional Classes.
  • Classification of Contracts into Functional Classes.
  • Identification of Affiliated Vendors.
  • Identification of Vendor Asphalt Facilities (plants) and Coordinates.
  • Identification of Work Type Market Areas.
  • Vetting Contract Locations (coordinates).

In the above, classification of items and contracts into functional classes means assigning the classifications that most accurately describe the type of work. That is, items are classed to describe the functional nature of the item, such as asphalt paving, concrete paving, earthwork, etc. Most DOT agencies do a good job of classing items into functional groups, but do not do a good job of classing contracts. In most cases, DOT agencies class contracts into “program types” such as resurfacing, highway maintenance, etc. The problem with this is that “resurfacing” may be either asphalt paving or concrete paving; similarly, highway maintenance may be asphalt paving, concrete paving, or even guardrail work. We correct contract classifications by summing the item class dollars (after these have been vetted) and then classing contracts accordingly.
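The contract-classing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout, the contract IDs, the dollar amounts, and the function name `classify_contracts` are all hypothetical, and the rule of assigning each contract to the item class with the largest dollar total is one plausible reading of "classing contracts accordingly", not a documented Infotech procedure.

```python
from collections import defaultdict

def classify_contracts(items):
    """Class each contract by its dominant item class.

    items: iterable of (contract_id, item_class, dollars) tuples whose
    item classes have already been vetted. The "largest dollar total wins"
    rule is an assumption made for this sketch.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for contract_id, item_class, dollars in items:
        totals[contract_id][item_class] += dollars
    # Pick, for each contract, the item class with the most dollars.
    return {
        contract_id: max(by_class, key=by_class.get)
        for contract_id, by_class in totals.items()
    }

# Made-up example data: C-001 is mostly asphalt, C-002 mostly concrete.
items = [
    ("C-001", "asphalt paving", 800_000.0),
    ("C-001", "earthwork", 150_000.0),
    ("C-001", "guardrail", 50_000.0),
    ("C-002", "concrete paving", 400_000.0),
    ("C-002", "asphalt paving", 100_000.0),
]
contract_classes = classify_contracts(items)
```

With this data, a contract the agency filed under a program type such as “resurfacing” would be classed as asphalt paving or concrete paving according to where its dollars actually went.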

Affiliated vendors must be identified before analysis starts. Common forms of affiliated vendors are:

  • Vendors that are subsidiaries of other firms.
  • Vendors that share officers with other firms.
  • Vendors with the same address as another firm.
  • Vendors that have the same phone number or fax number as another firm.
  • Vendors that have an e-mail address tied to the same construction firm.

This affiliation information is often missing from DOT data. Fortunately for DOTs that license AASHTOWare Project BAMS/DSS™ (BAMS/DSS), there are data tables (the VENDOR, VENDADDR, and VENDOFFR tables) to evaluate this. Non-DOT agencies may also have similar data. Vendor web pages are also helpful, and analyzing vendors’ facilities often provides additional information.

We identify affiliated vendors and “collapse” them into one vendor identity to avoid evaluating affiliated vendors as different vendors.
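As a rough illustration of the “collapse” step, the sketch below links vendors that share a normalized address or phone number and maps every vendor in a linked group to a single identity. The record layout, field names, vendor IDs, and the function `collapse_affiliates` are hypothetical; they are not the BAMS/DSS table structure. A shared address or phone number is only a screening signal here, so any match would still need to be confirmed by hand.

```python
def collapse_affiliates(vendors):
    """Map each vendor id to one representative id per affiliated group.

    vendors: dict of vendor_id -> dict with optional 'address' and 'phone'.
    Vendors sharing a normalized address or phone are treated as affiliated
    (an assumption for this sketch; real vetting needs manual review).
    """
    parent = {v: v for v in vendors}

    def find(v):
        # Follow parent links to the group's representative.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller id as representative so results are stable.
            if rb < ra:
                ra, rb = rb, ra
            parent[rb] = ra

    seen = {}  # (field, normalized value) -> first vendor seen with it
    for vendor_id, record in vendors.items():
        for field in ("address", "phone"):
            value = record.get(field)
            if not value:
                continue
            key = (field, value.strip().lower())
            if key in seen:
                union(vendor_id, seen[key])
            else:
                seen[key] = vendor_id
    return {v: find(v) for v in vendors}

# Made-up records: V100/V200 share an address, V200/V300 share a phone.
vendors = {
    "V100": {"address": "12 Quarry Rd", "phone": "555-0100"},
    "V200": {"address": "12 quarry rd ", "phone": "555-0199"},
    "V300": {"address": "9 Mill St", "phone": "555-0199"},
    "V400": {"address": "40 Main St", "phone": "555-0142"},
}
identity = collapse_affiliates(vendors)
```

Because affiliation is chained (V100 to V200 by address, V200 to V300 by phone), all three end up under one identity, while V400 stays on its own.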

Because the asphalt commodity can only be hauled a certain distance and time, it is critical that exact locations (coordinates) of facilities and contract locations be identified.
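To show why exact coordinates matter, here is a minimal sketch that computes the straight-line (great-circle) distance between a contract location and each asphalt plant and keeps the plants within a chosen radius. The coordinates, plant IDs, and the `max_miles` value are made up for illustration; the article does not state a haul limit, and a real analysis would more likely use road distance or haul time, for which straight-line distance is only a lower bound.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def plants_within(project, plants, max_miles):
    """Return (plant_id, miles) pairs within max_miles of the project, nearest first.

    project: (lat, lon); plants: dict of plant_id -> (lat, lon).
    max_miles is a hypothetical haul radius chosen by the analyst.
    """
    hits = []
    for plant_id, (lat, lon) in plants.items():
        miles = haversine_miles(project[0], project[1], lat, lon)
        if miles <= max_miles:
            hits.append((plant_id, miles))
    return sorted(hits, key=lambda pair: pair[1])

# Made-up coordinates for one contract location and two plants.
project = (38.35, -81.63)
plants = {"P1": (38.40, -81.60), "P2": (39.60, -79.95)}
nearby = plants_within(project, plants, max_miles=50.0)
```

A plant or project with a wrong coordinate can move in or out of this radius, which is exactly the kind of error that vetting locations is meant to catch.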

Identifying work type markets involves grouping a state’s counties into groups in which many of the same vendors bid. The classification of items and contracts, identification of affiliated vendors, and identification of vendor facilities must be done before defining market areas. These market areas then define the market analysis for different parts of the state.
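One simple way to start looking for such groups is to measure how much the sets of bidders in two counties overlap. The sketch below uses the Jaccard index (shared bidders divided by all bidders across both counties); this measure, the county names, and the vendor IDs are assumptions made for illustration, not a description of how Infotech defines market areas. The vendor IDs are assumed to be the collapsed identities, with affiliates already merged.

```python
from itertools import combinations

def bidder_overlap(bidders_by_county):
    """Return the Jaccard overlap of bidder sets for every pair of counties.

    bidders_by_county: dict of county name -> set of (collapsed) vendor ids
    that bid on contracts of one work type in that county.
    """
    overlaps = {}
    for a, b in combinations(sorted(bidders_by_county), 2):
        set_a, set_b = bidders_by_county[a], bidders_by_county[b]
        all_bidders = set_a | set_b
        shared = set_a & set_b
        overlaps[(a, b)] = len(shared) / len(all_bidders) if all_bidders else 0.0
    return overlaps

# Made-up data: counties A and B share two of four bidders; C shares none.
bidders_by_county = {
    "County A": {"V100", "V500", "V600"},
    "County B": {"V100", "V500", "V700"},
    "County C": {"V800", "V900"},
}
overlaps = bidder_overlap(bidders_by_county)
```

County pairs with high overlap are candidates for the same market area; what counts as “high” is a judgment the analyst has to make and defend.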

As stated earlier, different analyses require different data work. An asphalt paving market analysis is more complicated than, say, a guardrail market analysis. A guardrail market analysis will probably only involve the following:

  • Classification of Items into Functional Classes.
  • Classification of Contracts into Functional Classes.
  • Identification of Affiliated Vendors.

This is because guardrail work is highly mobile and less dependent on contract locations. Because of this, the entire state may be one market area.

I give many presentations on market analysis, and I always stress the importance of doing the up-front data work. We also stress this in Infotech’s annual Market Analysis Training (MAT). If you are a DOT agency doing a market data analysis, you cannot know at the outset whether your analysis will ultimately lead to a formal lawsuit. The best advice is to assume that it may. If it does, one of the first things the defendants’ lawyers will attack is the integrity of the data, because if they can poke holes in the data, they will cast doubt on the data analysis. In the recent West Virginia Asphalt Paving market analysis that did result in a lawsuit, I took the trouble to ensure the data was sound. I was pleased that late in the case, the lawyers admitted that they didn’t question the data. Fortunately, the case also had compelling analysis and resulted in a $101.35 million settlement, the largest antitrust settlement in West Virginia history. If the data is sound, your resulting data analysis has a much higher probability of being sound.

Even if you are only doing a routine data analysis for internal or external requests, it is very important that the data of your analysis is sound to maintain credibility.

As said before, the data work is tedious, but I find it interesting, particularly when you can identify previously unknown affiliates or contract locations that make no sense given the vendors’ bidding behavior. It is like a “treasure hunt”; I often laugh out loud when I find this previously unknown data.

The data work is very important and must come before any data analysis. Don’t make the common mistake of shortcutting this step; if you do, you will likely discover “data holes” during the analysis that result in time-consuming rework.

If your agency needs help in this type of work, contact Infotech at our email at


Jeff Derrer
Data Services Consultant
Jeff Derrer has more than 30 years of experience in transportation project management and engineering. He has been a Senior Consultant with Infotech since 2001, providing data analysis services to state transportation agencies. He previously worked for the North Carolina Department of Transportation as Head of the Contract Monitoring Section for more than 10 years, and for the Virginia Department of Transportation as Head of the Antitrust Section for four years. In those roles, he directed activities that analyzed contractors’ bids to ensure a competitive marketplace.