The Occupational Safety and Health Administration (OSHA) is an agency of the United States Department of Labor that keeps track of workplace safety issues. In particular, it documents work-related fatalities. I took OSHA’s blurbs for each workplace death that occurred between October 1st, 2015 and September 30th, 2016, filtered out less meaningful words, and formed a social network based on the strength of the terms’ co-occurrence. Specifically, two terms have a stronger co-occurrence if there is a high likelihood that they appear together in the same fatality blurb. Each dot is a term associated with fatalities. You can hover your mouse over a dot to see the term and drag the dots around to get a better view. The thickness of the lines between terms is proportional to the strength of their association.
- When a death involves a crane it often involves tipping.
- Mowing lawns seems strongly associated with drowning.
- When a death involves crushing it frequently involves an elevator.
- Deaths by collapse often take place in trenches.
- There is some association between cleaning and death by asphyxiation.
Let $C_{ij}$ be the number of times term $i$ appears in the same fatality blurb as term $j$, and let $C_{ii}$ be the total number of times term $i$ appears. This is the co-occurrence matrix. Now we form the strength matrix $S$:
$$S_{ij} = \frac{C_{ij}}{C_{ii}} \cdot \frac{C_{ij}}{C_{jj}} = \frac{C_{ij}^2}{C_{ii}\,C_{jj}}$$
The strength of two terms can be interpreted in a probabilistic way. It is the probability of choosing term $j$ out of all the fatalities that contain term $i$, and then choosing term $i$ out of all the fatalities that contain term $j$. If two terms have a strength greater than a fixed threshold, then they have a line between them. The thickness of the line is proportional to the strength of the link.
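As a rough illustration of the co-occurrence counts and strengths described above, here is a minimal Python sketch. It is not the code behind the visualization: the stop-word list, the function names, the sample blurbs, and the threshold of 0.3 are all made up for the example. It also assumes that a term is counted at most once per blurb, so that each ratio can be read as a probability.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; the filter actually used is not given in the post.
STOP_WORDS = {"a", "an", "the", "by", "in", "on", "of", "and", "was",
              "when", "while", "over", "worker", "employee"}


def tokenize(blurb):
    """Return the set of distinct, lowercased, non-stop-word terms in a blurb."""
    words = (w.strip(".,;:()\"'").lower() for w in blurb.split())
    return {w for w in words if w and w not in STOP_WORDS}


def cooccurrence(blurbs):
    """Count blurbs containing each term and blurbs containing each pair of terms."""
    term_counts = Counter()   # diagonal of the co-occurrence matrix
    pair_counts = Counter()   # off-diagonal entries, keyed by sorted (a, b)
    for blurb in blurbs:
        terms = sorted(tokenize(blurb))
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))
    return term_counts, pair_counts


def strengths(term_counts, pair_counts):
    """Strength of (a, b) = P(b | a) * P(a | b) = pair^2 / (count_a * count_b)."""
    return {
        (a, b): n * n / (term_counts[a] * term_counts[b])
        for (a, b), n in pair_counts.items()
    }


def edges(strength, threshold):
    """Keep only the pairs whose strength exceeds the threshold."""
    return {pair: s for pair, s in strength.items() if s > threshold}


# Made-up blurbs, for illustration only (not taken from OSHA's data).
demo_blurbs = [
    "Worker crushed by elevator.",
    "Worker crushed when crane tipped.",
    "Crane tipped over.",
]
demo_terms, demo_pairs = cooccurrence(demo_blurbs)
demo_strength = strengths(demo_terms, demo_pairs)
demo_edges = edges(demo_strength, threshold=0.3)  # 0.3 is an arbitrary choice
```

In this toy input, "crane" and "tipped" always appear together, so their strength is 1.0, while "crane" and "crushed" share only one of the two blurbs each appears in, giving 0.25, which falls below the example threshold.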
OSHA’s data can be found here.
I used Python3 to clean the data and D3.js for the visualization.
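For readers curious how the Python side might hand data to D3.js, here is a hedged sketch that turns a dictionary of pair strengths into the nodes/links JSON layout commonly used with D3 force-directed graphs. The function name, the field names (`id`, `source`, `target`, `value`), the sample strengths, and the threshold are assumptions for illustration, not the format actually used for this visualization.

```python
import json


def to_d3_graph(strength, threshold):
    """Build a {"nodes": [...], "links": [...]} dict from pair strengths.

    `strength` maps (term_a, term_b) tuples to a number; only pairs whose
    strength exceeds `threshold` become links, and only terms that take
    part in at least one link become nodes.
    """
    kept = {pair: s for pair, s in strength.items() if s > threshold}
    terms = sorted({term for pair in kept for term in pair})
    return {
        "nodes": [{"id": term} for term in terms],
        "links": [
            {"source": a, "target": b, "value": s}
            for (a, b), s in sorted(kept.items())
        ],
    }


# Made-up strengths, for illustration only.
sample_strength = {
    ("crane", "tipped"): 1.0,
    ("crushed", "elevator"): 0.5,
    ("crane", "crushed"): 0.25,
}
graph = to_d3_graph(sample_strength, threshold=0.3)  # arbitrary threshold
graph_json = json.dumps(graph, indent=2)             # string to save or serve
```

On the D3 side, the `value` field could then drive the stroke width of each line, matching the idea that thicker lines mean stronger links.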