Data Collection, Modeling


We are an H1B data company. Data quality is our top priority!

How do we collect H1B Data? What are the Official Data Sources?

Our H1B data comes from official US government sources, where H1B-related data is released under the public disclosure requirements set for these US federal agencies. Below are some of the data sources that we use.

  • US Department of Labor: H1B Labor Condition Application (LCA) Disclosure Data, PERM Program Disclosure Data that is part of the US Green Card process, Prevailing Wage Data, etc.
  • USCIS: H1B Approvals Data by Company, H1B Data Processing Quarterly Trends, Historical Reports
  • Public disclosure data for general information, such as data from the Census Bureau


What is the volume of H1B Data used for Intelligent Insights?

We processed about 6 million data records to produce these insights. Most of these records have about 135 columns, so the total volume is close to 800 million cells of data!
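As a rough check of the figure above, the cell count can be estimated by multiplying the record count by the column count. This is only a back-of-the-envelope sketch: it assumes every record has the full 135 columns, which the text says is true for most records but not all, so the real total would be somewhat lower. The variable names are illustrative.

```python
# Back-of-the-envelope estimate of the data volume.
# Assumption: every record has all 135 columns (the text says "most" do).
records = 6_000_000
columns_per_record = 135

estimated_cells = records * columns_per_record
print(f"{estimated_cells:,} cells")  # 810,000,000 cells
```

The estimate of about 810 million is in line with the "close to 800 million" figure, given that some records have fewer columns.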


How do we ensure Data Quality?

We use advanced data cleaning mechanisms to ensure data quality. Most of the data we receive from official government sources is not clean and requires significant cleanup. We apply a variety of data cleanup algorithms to it. Our team also reviews and cleans data manually where automated algorithms cannot provide quality data. H1B data quality and integrity is our top priority!
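To give a flavor of what one automated cleanup step might look like, here is a minimal sketch that normalizes employer names so that spelling variants of the same company are grouped together. This is an illustration only, not our actual pipeline: the function names, the suffix list, and the sample names are all hypothetical.

```python
import re

# Hypothetical list of legal suffixes to ignore when comparing names.
_LEGAL_SUFFIXES = {"INC", "LLC", "LTD", "CORP", "CORPORATION", "CO", "LP", "LLP"}


def normalize_employer_name(raw: str) -> str:
    """Return a canonical form of an employer name (illustrative rules only)."""
    # Upper-case, then turn anything that is not A-Z, 0-9, or a space into a space.
    cleaned = re.sub(r"[^A-Z0-9 ]+", " ", raw.upper())
    # split() with no arguments also collapses repeated whitespace.
    tokens = cleaned.split()
    # Drop trailing legal suffixes, but never remove the last remaining token.
    while len(tokens) > 1 and tokens[-1] in _LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def group_by_employer(names):
    """Group raw name variants under their normalized form."""
    groups = {}
    for name in names:
        groups.setdefault(normalize_employer_name(name), []).append(name)
    return groups


# Made-up sample names, not real disclosure data.
variants = ["Acme Widgets, Inc.", "acme  widgets LLC", "ACME WIDGETS CO., LTD."]
print(group_by_employer(variants))
```

Rules like these catch only the simple cases. Abbreviations, misspellings, and company renames are the kind of thing that would still need fuzzier matching or manual review.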


How do we do Data Modeling?

We use a variety of quantitative modeling techniques to analyse H1B data and present insights about H1B companies and job titles. We use our proprietary algorithms to grade H1B companies on various factors. We also present a confidence score for each grade, based on how much data is available, to indicate how trustworthy the grade is.