OdontoSearch 3.2 Beta

Datasets

Information about the Datasets Used with OdontoSearch 3.2

OdontoSearch is built on three distinct datasets drawn from different U.S. populations: one military and two civilian. They are referred to as TSCOHS (Military), BigMouth (Civilian), and NHANES (Civilian). The data from these studies have been pooled for use with OdontoSearch 3.2. All datasets consist of anonymized dental records of United States residents aged 14 and older. In total, the OdontoSearch 3.2 dataset comprises 107,002 individuals.

Customized searches can be completed based on age, sex, and ancestry. Ancestry data were not specified for some of the records in the BigMouth dataset, but age and sex are available for almost every record within the OdontoSearch program. Users have the ability to query individual datasets or to pool all of the records.
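As a rough illustration of how restricting a search to particular datasets, a sex, or an ancestry group changes the size of the reference sample, the sketch below sums the "Total" rows of the three per-dataset tables in this document. The dictionary layout and the function name reference_sample_size are hypothetical and are not part of OdontoSearch itself; age filtering is left out to keep the sketch short.

```python
# Counts copied from the "Total" rows of the TSCOHS, BigMouth, and NHANES tables
# in this document. Column order within each ancestry group: Male, Female, Unknown.
# The data layout and function name are illustrative assumptions, not OdontoSearch's API.
ANCESTRIES = ("White", "Black", "Unknown/Other")
SEXES = ("Male", "Female", "Unknown")

TOTAL_ROWS = {
    "TSCOHS": (11663, 1867, 0, 2926, 756, 0, 1870, 324, 0),
    "BigMouth": (1696, 2116, 17, 676, 1263, 18, 14649, 21545, 1053),
    "NHANES": (11111, 12019, 0, 5485, 6079, 0, 4776, 5093, 0),
}


def reference_sample_size(datasets, ancestry=None, sex=None):
    """Count records in the chosen datasets, optionally limited by ancestry and/or sex."""
    total = 0
    for name in datasets:
        row = TOTAL_ROWS[name]
        for i, anc in enumerate(ANCESTRIES):
            for j, sx in enumerate(SEXES):
                if ancestry is not None and anc != ancestry:
                    continue
                if sex is not None and sx != sex:
                    continue
                # Each ancestry group occupies three consecutive columns.
                total += row[3 * i + j]
    return total


if __name__ == "__main__":
    everything = ("TSCOHS", "BigMouth", "NHANES")
    print(reference_sample_size(everything))
    print(reference_sample_size(("BigMouth", "NHANES")))
    print(reference_sample_size(("TSCOHS",), sex="Female"))
```

Pooling all three datasets with no filters reproduces the overall sample size stated above, while narrowing the query shrinks the reference sample accordingly.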

TSCOHS Data (Military):


The dental health data representing the military population were graciously provided by the Tri-Service Center for Oral Health Studies, which is affiliated with the Uniformed Services University of the Health Sciences, Bethesda, Maryland. The raw data from this source were part of an ongoing study of dental health throughout the active duty and recruit population of the U.S. military. The data were collected in 1994 and 2000 as part of two phases of the Tri-Service Comprehensive Oral Health Survey (TSCOHS). The 1994 data consist of detailed dental conditions of active duty personnel and recruits from all branches of the service and from different military installations across the continental U.S. The 2000 phase of TSCOHS also covered all branches of the military, but only recruits. Because the 2000 data included only recruits, the combined TSCOHS dataset is biased toward the recruit population relative to active duty personnel. Individuals in the study ranged from 17 to 69 years of age. Additional information regarding TSCOHS can be found at the TSCOHS website (http://www.usuhs.mil/tscohs).

Sample Size and Demographic Composition of the TSCOHS Data


(N=19,406)

Age   | White                   | Black                   | Unknown/Other
      |    Male  Female Unknown |    Male  Female Unknown |    Male  Female Unknown
14-16 |       0       0       0 |       0       0       0 |       0       0       0
17-20 |    3004     663       0 |     787     280       0 |     646     157       0
21-30 |    5288     853       0 |    1351     344       0 |     797     127       0
31-40 |    2730     286       0 |     675     110       0 |     339      31       0
41-50 |     618      64       0 |     112      22       0 |      84       9       0
51-60 |      22       1       0 |       1       0       0 |       4       0       0
61-70 |       1       0       0 |       0       0       0 |       0       0       0
71-80 |       0       0       0 |       0       0       0 |       0       0       0
81-90 |       0       0       0 |       0       0       0 |       0       0       0
Total |   11663    1867       0 |    2926     756       0 |    1870     324       0

 

BigMouth Data (Civilian):


The dental health data representing a major portion of the OdontoSearch reference population were graciously provided by the BigMouth Dental Data Repository. BigMouth is composed of electronic health records contributed for research purposes by numerous U.S. dental schools that are part of the Consortium for Oral Health Research and Informatics (COHRI) (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4215035/). The full BigMouth database holds dental information on over 2 million patients, a subset of which has been incorporated into OdontoSearch. As the BigMouth dataset contains information on individuals ranging from infants to the elderly, a subset of de-identified data consisting of individuals from 14-90 years of age was extracted for OdontoSearch.


Sample Size and Demographic Composition of the BigMouth Data


(N=43,033)

Age   | White                   | Black                   | Unknown/Other
      |    Male  Female Unknown |    Male  Female Unknown |    Male  Female Unknown
14-16 |      17      18       0 |       6       9       0 |      96     128       5
17-20 |      31      26       1 |      11      15       0 |     258     292       4
21-30 |     203     318       5 |      51     127       3 |    1784    2426     178
31-40 |     218     265       3 |      95     167       2 |    2654    4064     452
41-50 |     231     316       2 |      80     176       4 |    2398    3805      92
51-60 |     338     410       4 |     149     301       5 |    2628    4265     108
61-70 |     403     440       1 |     180     301       2 |    2566    3845     161
71-80 |     199     232       0 |      85     127       2 |    1607    1933      35
81-90 |      56      91       1 |      19      40       0 |     658     787      18
Total |    1696    2116      17 |     676    1263      18 |   14649   21545    1053

 

NHANES Data (Civilian):


The National Health and Nutrition Examination Survey (NHANES) dental data are collected periodically as part of an initiative by the Centers for Disease Control and Prevention to study dental health. NHANES is a multifaceted health examination survey conducted in various locations across the United States to collect data on the civilian, non-institutionalized population. Participants represent a cross-section of the U.S. civilian population, and dental health information is only a single facet of the overall study. Dental data from several NHANES studies covering the years 1988 through 2014 were consolidated. For adults, data were collected for only the 28 permanent teeth (third molars excluded). These data are available to the general public for research purposes via the NHANES website (http://www.cdc.gov/nchs/about/major/nhanes/datalink.htm). As the NHANES dataset contains information on individuals ranging from infants to the elderly, a subset of data consisting of individuals from 14-90 years of age was extracted for OdontoSearch.


Sample Size and Demographic Composition of the NHANES Data


(N=44,563)

Age   | White                   | Black                   | Unknown/Other
      |    Male  Female Unknown |    Male  Female Unknown |    Male  Female Unknown
14-16 |     704     786       0 |     747     679       0 |     712     751       0
17-20 |     886     920       0 |     740     778       0 |     756     784       0
21-30 |    1704    1985       0 |     852    1035       0 |     763     853       0
31-40 |    1611    1897       0 |     838    1039       0 |     650     732       0
41-50 |    1419    1476       0 |     698     808       0 |     596     619       0
51-60 |    1291    1334       0 |     557     613       0 |     455     464       0
61-70 |    1412    1440       0 |     614     638       0 |     510     545       0
71-80 |    1370    1423       0 |     370     379       0 |     283     282       0
81-90 |     714     758       0 |      69     110       0 |      51      63       0
Total |   11111   12019       0 |    5485    6079       0 |    4776    5093       0

 

Combined Civilian Data (BigMouth and NHANES)

This dataset combines the private (BigMouth) and public (NHANES) civilian dental data and consists of individuals from 14-90 years of age. See the corresponding sections for information related to these samples.

Sample Size and Demographic Composition of Combined Civilian Data (BigMouth and NHANES)


(N=87,596)

Age   | White                   | Black                   | Unknown/Other
      |    Male  Female Unknown |    Male  Female Unknown |    Male  Female Unknown
14-16 |     721     804       0 |     753     688       0 |     808     879       5
17-20 |     917     946       1 |     751     793       0 |    1014    1076       4
21-30 |    1907    2303       5 |     903    1162       3 |    2547    3279     178
31-40 |    1829    2162       3 |     933    1206       2 |    3304    4796     452
41-50 |    1650    1792       2 |     778     984       4 |    2994    4424      92
51-60 |    1629    1744       4 |     706     914       5 |    3083    4729     108
61-70 |    1815    1880       1 |     794     939       2 |    3076    4390     161
71-80 |    1569    1655       0 |     455     506       2 |    1890    2215      35
81-90 |     770     849       1 |      88     150       0 |     709     850      18
Total |   12807   14135      17 |    6161    7342      18 |   19425   26638    1053

 

Combined Data (TSCOHS, BigMouth and NHANES):

This dataset is simply a compilation of all of the TSCOHS, BigMouth, and NHANES data. See the corresponding sections for information related to these samples. As this dataset represents a large sample of the contemporary U.S. military and civilian population, it should be very useful for calculating frequency values related to modern forensic cases. The combined dataset has a total sample size of 107,002 individuals.


Sample Size and Demographic Composition of Combined Data (TSCOHS, BigMouth and NHANES)



(N=107,002)

Age   | White                   | Black                   | Unknown/Other
      |    Male  Female Unknown |    Male  Female Unknown |    Male  Female Unknown
14-16 |     721     804       0 |     753     688       0 |     808     879       5
17-20 |    3921    1609       1 |    1538    1073       0 |    1660    1233       4
21-30 |    7195    3156       5 |    2254    1506       3 |    3344    3406     178
31-40 |    4559    2448       3 |    1608    1316       2 |    3643    4827     452
41-50 |    2268    1856       2 |     890    1006       4 |    3078    4433      92
51-60 |    1651    1745       4 |     707     914       5 |    3087    4729     108
61-70 |    1816    1880       1 |     794     939       2 |    3076    4390     161
71-80 |    1569    1655       0 |     455     506       2 |    1890    2215      35
81-90 |     770     849       1 |      88     150       0 |     709     850      18
Total |   24470   16002      17 |    9087    8098      18 |   21295   26962    1053
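
Because the combined dataset is a straight compilation of the three sources, each row of the combined table should equal the column-by-column sum of the matching rows in the TSCOHS, BigMouth, and NHANES tables. The following minimal Python sketch checks this for the 17-20 age row and the Total row, using counts copied from the tables in this document. The helper name pool_rows is hypothetical and only illustrates the arithmetic.

```python
# Rows copied from the tables in this document. Column order:
# White (Male, Female, Unknown), Black (Male, Female, Unknown),
# Unknown/Other (Male, Female, Unknown).
TSCOHS_17_20 = (3004, 663, 0, 787, 280, 0, 646, 157, 0)
BIGMOUTH_17_20 = (31, 26, 1, 11, 15, 0, 258, 292, 4)
NHANES_17_20 = (886, 920, 0, 740, 778, 0, 756, 784, 0)

TSCOHS_TOTAL = (11663, 1867, 0, 2926, 756, 0, 1870, 324, 0)
BIGMOUTH_TOTAL = (1696, 2116, 17, 676, 1263, 18, 14649, 21545, 1053)
NHANES_TOTAL = (11111, 12019, 0, 5485, 6079, 0, 4776, 5093, 0)


def pool_rows(*rows):
    """Sum demographic count rows column by column (illustrative helper)."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows must have the same number of columns")
    return tuple(sum(column) for column in zip(*rows))


if __name__ == "__main__":
    pooled_17_20 = pool_rows(TSCOHS_17_20, BIGMOUTH_17_20, NHANES_17_20)
    pooled_total = pool_rows(TSCOHS_TOTAL, BIGMOUTH_TOTAL, NHANES_TOTAL)
    print(pooled_17_20)
    print(pooled_total)
    print(sum(pooled_total))  # overall number of individuals in the pooled data
```

The same check can be repeated for any other age row by substituting the corresponding counts.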