Breast Cancer Risk Project

A spatial epidemiology study conducted by

ifbc8503s.gif

For information regarding this map contact: Alex Brown (617) 308-9456. See below for details.


Exploratory temporal visualization of Massachusetts breast cancer data archives (PDF)


Case-control studies of lung and breast cancer in the Upper Cape Cod region of Massachusetts (Vieira et al, Paulu et al 2002, 2008) showed spatial clustering of risk based on residential location. Statewide Massachusetts cancer incidence mapping by Silent Spring Institute also showed clustering in the greater Boston area and on Cape Cod. To expand the spatial scope of these studies, town-level archives of breast cancer records from the Massachusetts Department of Public Health Cancer Registry, in the Massachusetts Community Health Information Profile (MassCHIP) public database system, were used to explore spatial and temporal patterns in the statewide distribution of breast cancer occurrence. Data suppression for confidentiality, together with great differences in population among the 351 towns of Massachusetts, resulted in large gaps in this data. While only access to complete MassCHIP data could support full analysis of the spatial distribution of breast cancer incidence, a synthetic dataset based on MassCHIP but supplemented by estimation of suppressed data was able to support exploration of methods of temporal visualization and cluster identification. Variance instability in this dataset was evaluated using Empirical Bayes rate smoothing. Major clusters of breast cancer incidence in greater Boston and on Cape Cod were confirmed in the smoothed rate data, while possible time-varying clusters observed in western Massachusetts were not.


PHGIS'09 presentation, 6 June 2009

09gis_public_logocompressed.jpg


Breast Cancer Risk Project (FY08) presentation at URISA Public Health GIS Conference, Providence RI June 5-8 2009:


Project participants

Investigators:


Research assistant: Alex Brown, Toxics Use Reduction Institute (TURI) at UMass-Lowell and UML Dept of Environmental, Earth & Atmospheric Sciences

Maps, animation graphics, calculator software, and these web pages prepared by Alex Brown -- last updated 23:00, 23 June 2009 (EDT)

Resources, data, and methods

EpiResources

  • Initial study area and period: Upper Cape Cod Cancer Incidence Study area, 1983-1986.
    • "A method for spatial analysis of risk in a population-based case-control study" - Veronica Vieira, Thomas Webster, Ann Aschengrau, David Ozonoff. Int. J. Hyg. Environ. Health 205, 115-120 (2002) (PDF)
    • "Exploring Associations between Residential Location and Breast Cancer Incidence in a Case–Control Study" - Christopher Paulu, Ann Aschengrau, and David Ozonoff. Environmental Health Perspectives Vol 110, No 5 (May 2002) pp. 471-478 (PDF)
  • Silent Spring Institute: MassHEIS - Massachusetts Health and Environmental Information System
    • 2005 presentation on Silent Spring Institute studies on environmental carcinogens
    • About MassHEIS - Silent Spring Institute has developed a web-based, interactive mapping tool that serves the dual goals of community access to health and environmental information about communities in Massachusetts and researcher access to underlying datasets developed in the Institute’s Cape Cod Breast Cancer and Environment Study and by state, federal, and other nonprofit sources.
    • Supported by National Library of Medicine
    • Uses ArcIMS web mapserver technology - very convenient for ArcGIS exploration, analysis, presentation
  • Massachusetts Community Health Information Profile (MassCHIP)
  • Massachusetts Cancer Registry
    • Responsible for the collection of information regarding all newly diagnosed cases of cancer in Massachusetts.
  • MassBenchmarks (MISER) - Massachusetts demographic statistics based on US Census data - http://www.massbenchmarks.org/statedata/data.htm

Interpolation methods evaluation

To evaluate interpolation methods for sparse aggregate data sets, interpolation methods were first tested using a regional study where both raw and aggregate data were available.

Statewide interpolation of town-level data

MassSIR9502town-s.jpg

In the above map, inverse root distance weighting was applied to town-level aggregated SIR data located at town geographic centroids, with a radius of influence of 30 km (w(r,t) = r**-t, r = 0..30 km, t = 0.5). (Source of data: SSI)

  • A range of exponent and radius values was tested; see IDW parameter tests.
  • This map presents interpolated SIR using a color ramp covering two standard deviations.
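The weighting described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual processing: the function name `idw_sir`, the tuple layout of `centroids`, and the use of planar coordinates in kilometres are assumptions made for the example. Only the weight w = r**-t with t = 0.5 and the 30 km radius come from the text.

```python
import math

def idw_sir(x, y, centroids, power=0.5, radius_km=30.0):
    """Interpolate SIR at point (x, y) from town centroids.

    centroids: iterable of (cx, cy, sir) with coordinates in km
    (an assumed layout for this sketch). Weights are r ** -power,
    and towns farther away than radius_km are ignored.
    Returns None when no centroid lies within the radius.
    """
    num = 0.0
    den = 0.0
    for cx, cy, sir in centroids:
        r = math.hypot(x - cx, y - cy)
        if r == 0.0:
            # Exactly on a centroid: the weight is unbounded, so use that town's value.
            return sir
        if r <= radius_km:
            w = r ** -power
            num += w * sir
            den += w
    return num / den if den > 0 else None

# Hypothetical towns: (x_km, y_km, SIR)
towns = [(0.0, 0.0, 1.0), (4.0, 0.0, 2.0)]
estimate = idw_sir(1.0, 0.0, towns)
```

With an exponent of 0.5 the weights fall off slowly with distance, so distant towns inside the radius still contribute noticeably; larger exponents make the surface follow the nearest centroid more closely.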

Metadata for SSI-MassHEIS breast cancer data

Statistical significance of town-level SIR

Breast cancer incidence variation over time

The above studies aggregate records of female breast cancer standardized incidence ratio (SIR) over periods of several years. The spatial distribution of SIR changes over time, as indicated by maps of Mass Breast Cancer SIR by Towns, 1996-2004, which show SIR by town and the corresponding statewide inverse root distance interpolation of SIR for four five-year periods. (NOTE: The period 1997-2001 is omitted.)


See these viewers for a sequential view of this short time series:


Source: NoNABCstats9604.xls, thanks to Kathleen Attfield, SSI.


Animations, Mass Breast Cancer SIR by Towns, 1996 - 2004

Massachusetts breast cancer SIR by town, and inverse root distance interpolation with the same parameters as above, are shown for four five-year periods in these animations. (NOTE: The period 1997-2001 is omitted.) These animations were prepared by adjusting the green-red color ramp to show relative town SIR values; the second animation below shows interpolation of town SIR values from two standard deviations below to two standard deviations above the mean for each frame. Correspondence between color and value is therefore not consistent from frame to frame, and represents only relative incidence in each interval.
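The per-frame color stretch described above can be illustrated with a short Python sketch. The function name `stretch_frame` and the use of the population standard deviation are assumptions for the example; the source states only that the ramp runs from two standard deviations below to two above each frame's mean.

```python
import statistics

def stretch_frame(values, n_sd=2.0):
    """Map one frame's values to positions 0..1 on a color ramp.

    0 corresponds to mean - n_sd * SD (green end), 1 to mean + n_sd * SD
    (red end); values outside that range are clipped. The population SD
    is an assumed choice for this sketch.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # All values identical: place everything at the middle of the ramp.
        return [0.5 for _ in values]
    lo = mean - n_sd * sd
    hi = mean + n_sd * sd
    return [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in values]

# Two hypothetical frames: the value 1.0 appears in both
frame_a = stretch_frame([0.8, 1.0, 1.2])
frame_b = stretch_frame([1.0, 1.2, 1.4])
```

In this example the SIR value 1.0 sits at the middle of the ramp in the first frame but near the green end in the second, which is why colors in these animations indicate only relative incidence within a frame.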


  • 1996to2004ssi-playout-s.gif
  • MassTownsIfbcSir1996to2004layout-s.gif


Relative breast cancer incidence calculation 1985-2003

We have calculated a synthetic SIR dataset based on MassCHIP but supplemented by estimation of suppressed data by year and by town, to develop a more complete time series of interpolated risk surfaces. The result is mapped in an animation of a time series of fifteen maps of invasive female breast cancer estimated incidence by town and by year, at the head of this page, and described below.

ifbc8503s.gif

The above animation of fifteen maps, one for each of fifteen five-year moving average intervals covering 1985 to 2003, was prepared by stretching the green-red color ramp from two standard deviations below to two standard deviations above the mean for each map. Correspondence between color and value is therefore not consistent from frame to frame -- color represents relative incidence in each interval.
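As a check on the interval count, the following Python sketch builds five-year moving averages from an annual series. The function name `moving_average` and the sample values are hypothetical; the nineteen annual values for 1985 to 2003 yield the fifteen intervals mentioned above.

```python
def moving_average(series, first_year, window=5):
    """Return {(start_year, end_year): mean} for each run of `window` years.

    series: annual values in year order, starting at first_year.
    """
    out = {}
    for i in range(len(series) - window + 1):
        chunk = series[i:i + window]
        start = first_year + i
        out[(start, start + window - 1)] = sum(chunk) / window
    return out

# Hypothetical annual incidence ratios for one town, 1985..2003 (19 values)
annual = [1.0 + 0.01 * k for k in range(19)]
frames = moving_average(annual, first_year=1985)
intervals = sorted(frames)  # first is (1985, 1989), last is (1999, 2003)
```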

Source: Breast Cancer Prevention Project / TURI, based on data from MassCHIP and Massachusetts Cancer Registry - Statewide Reports

NOTES:

  1. Individual frames of this animation are available here.
  2. In this time series animation, colors indicate different data values in successive frames. Colors should be interpreted as an indication of relative, not absolute incidence of invasive female breast cancer. An animation showing absolute incidence is available here.
  3. The calculation of this index is based on MassCHIP and Massachusetts Cancer Registry report data, by town and by year, in which low town population (i.e. sample size) has a significant impact on statistical calculations. In particular:
    1. Incidence counts of one, two, three, or four in a town are suppressed for medical records privacy reasons, as described below; the "NA" indicator for suppression is found in nearly half the records in this MassCHIP dataset. For each suppressed record, an expected value was calculated from the statewide incidence reported by DPH in Massachusetts Cancer Registry reports and the estimated population of the age group for that town and year. Statewide incidence by age group for each year was then recalculated and used to compute an incidence ratio for each town for each year. As described, this incidence ratio was then averaged for each town over each run of five successive years, from 1985 to 2003.
    2. The resulting index is susceptible to exaggeration where sample sizes are small, since incidence count is an integer. This effect may be present in values computed for western Massachusetts in the 1990s.
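The estimation step in note 3 can be sketched as follows. This is an illustration under stated assumptions, not the project's calculator software: the function names, the record layout of (count, population) pairs for one age group and year, and the sample numbers are all hypothetical. The sketch replaces each "NA" with population times the reported statewide rate, then recomputes the statewide rate from the filled records.

```python
def estimate_count(count, population, statewide_rate):
    """Return a reported count unchanged; replace "NA" with an expected value.

    The expected value is the town's age-group population times the
    statewide incidence rate for that age group and year.
    """
    if count == "NA":
        return population * statewide_rate
    return float(count)

def recalculated_rate(records, statewide_rate):
    """Recompute the statewide rate for one age group and year.

    records: list of (count_or_NA, population) pairs, one per town
    (an assumed layout for this sketch).
    """
    cases = sum(estimate_count(c, p, statewide_rate) for c, p in records)
    population = sum(p for _, p in records)
    return cases / population

# Hypothetical towns for a single age group and year
records = [(12, 4000), ("NA", 1000), (7, 3000)]
rate = recalculated_rate(records, statewide_rate=0.002)
```

Because the estimate for a suppressed record is not an integer, it avoids forcing a small town to zero or to an arbitrary whole count, but it also pulls every suppressed town toward the statewide rate.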


Standardized Incidence Ratio (SIR): age-adjusted incidence compared to expectation

Annual invasive female breast cancer SIR calculation:

Each annual sheet shows MassCHIP recorded incidence by age groups, including "NA" suppression values, and total incidence for all age groups (with NA values set to 0) in the column labelled "incidence". It does not show expected cases by age group based on annual statewide age group incidences and estimated age group populations for the town and year. These are calculated in processing (below) and accumulated into the "expected" column. Dividing "incidence" by "expected" produces a raw "sir" column. A summary sheet on the far right shows annual raw SIR by town.

See Numerator for details.

Numerator: Invasive female breast cancer incidence by town by age

MassCHIP database of incidence by age groups, including "NA" suppression values as described.

Denominator: Expected incidence

Expected incidence is based on town population by gender and age and statewide incidence rate by age group. Statewide incidence by age group can be calculated from records of incidence by town and by age, but suppression of low case count records complicates the problem.
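The numerator and denominator described above combine as in the following Python sketch. The function names, age-group labels, and numbers are hypothetical; treating "NA" as 0 in the observed total follows the description of the "incidence" column above.

```python
def expected_cases(town_pop_by_age, state_rate_by_age):
    """Expected cases: sum over age groups of town population * statewide rate."""
    return sum(town_pop_by_age[age] * state_rate_by_age[age]
               for age in town_pop_by_age)

def raw_sir(observed_by_age, town_pop_by_age, state_rate_by_age):
    """Raw SIR = observed incidence / expected incidence for one town and year.

    Suppressed ("NA") age-group counts contribute 0 to the observed total.
    """
    observed = sum(0 if c == "NA" else c for c in observed_by_age.values())
    return observed / expected_cases(town_pop_by_age, state_rate_by_age)

# Hypothetical town: female population and statewide rates by age group
pop = {"40-49": 1000, "50-59": 800, "60-69": 500}
rates = {"40-49": 0.001, "50-59": 0.0025, "60-69": 0.004}
cases = {"40-49": "NA", "50-59": 3, "60-69": 3}
sir = raw_sir(cases, pop, rates)  # 6 observed against about 5 expected
```

An SIR above 1 means more cases were recorded than the town's age structure would predict at statewide rates; setting suppressed counts to 0 biases the raw value downward in small towns.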

SIR calculation data and product files

Work products

Spatial statistics

Small sample sizes: Analysis for presence of rate and spatial distribution artifacts

Cluster analysis: Local indicators of spatial autocorrelation

Conclusions

This work shows the value of simple visualization methods in exploratory data analysis for sparse and incomplete data. A synthetic data set composed of recorded and estimated incidence records was used to explore spatial modeling and analysis methods; this data set showed a small number of outliers and anomalies. Rate smoothing was used to correct variance instability, and cluster detection confirmed expected cluster locations in this rate smoothed data. Rate smoothing may introduce spatial autocorrelation into cluster detection, however. Local Moran and Getis-Ord cluster tests confirmed known clusters in the greater Boston region and Cape Cod, but did not confirm unstable clusters in western Massachusetts. More work is necessary to quantify clustering of risk and possible association with environmental factors, using more complete data from Massachusetts breast cancer archives for this period. Access to this data has been approved and further work is expected to resume when funds are available.
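To show how rate smoothing stabilizes small-town values, here is a minimal Python sketch of a global Empirical Bayes estimator using method-of-moments shrinkage toward the overall ratio. The page does not state which Empirical Bayes variant or software was used, so this form, the function name `eb_smooth`, and the sample numbers are assumptions for illustration only.

```python
def eb_smooth(observed, expected):
    """Shrink raw SIR values (observed / expected) toward the overall ratio.

    Towns with small expected counts are shrunk most. Uses a global
    method-of-moments prior; this choice is an assumption of the sketch.
    """
    total_e = sum(expected)
    m = sum(observed) / total_e                  # overall ratio (prior mean)
    raw = [o / e for o, e in zip(observed, expected)]
    mean_e = total_e / len(expected)
    # Prior variance: weighted variance of raw ratios minus expected sampling noise
    var = sum(e * (r - m) ** 2 for r, e in zip(raw, expected)) / total_e - m / mean_e
    var = max(var, 0.0)
    smoothed = []
    for r, e in zip(raw, expected):
        w = var / (var + m / e) if var > 0 else 0.0  # shrinkage weight in [0, 1)
        smoothed.append(m + w * (r - m))
    return smoothed

# Hypothetical towns: one small town with a high raw SIR, two large towns
obs = [3, 100, 110]
exp = [1.0, 100.0, 100.0]
smoothed = eb_smooth(obs, exp)
```

In this example the small town's raw SIR of 3.0 is pulled almost entirely back to the overall ratio, while the large towns move much less. Because every town is shrunk toward a shared value, neighboring smoothed rates become more alike, which is the kind of dependence the caution about cluster detection refers to.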


slide0064_image071.jpg
