Statistics of Income Distributed Processing System (SOI-DPS)

Privacy Impact Assessment - Statistics of Income Distributed Processing System (SOI-DPS)

SOI-DPS System Overview

The IRS Statistics of Income (SOI) program fulfills a statutory requirement under IRC 6108 to compile and publish timely statistical data based on tax returns, including individual, corporate, partnership, exempt organization, and estate, as well as returns related to foreign activities.  In addition to internal use by the Service for research purposes, the Treasury Department’s Office of Tax Analysis (OTA), the Congressional Joint Committee on Taxation (JCT), the Commerce Department’s Bureau of Economic Analysis (BEA), and other government agencies and private sector organizations use these data.  The SOI Distributed Processing System (SOI-DPS) is an online geographically distributed processing environment. It is dependent on a combination of mainframe (for "residual" processing) and midrange computers, workstations, microcomputers, and thin clients and is essential to the success of the SOI program. The system has been designed, developed, and implemented with a goal of "plug compatibility" with other modernization efforts underway in the Service.

System of Records Number(s) 

Treasury/IRS 70.001, Individual Income Tax Returns, Statistics of Income
Treasury/IRS 34.037, IRS Audit Trail and Security Records System

Data in the System

1. Describe the information (data elements and fields) available in the system in the following categories:

A. Taxpayer
B. Employee
C. Audit Trail Information (including employee log-in info)
D. Other (Describe)

* Taxpayer:  Data are extracted from almost every line of the returns designated for Statistics of Income (SOI) studies.  The data are perfected and become part of microdata files on which the SOI estimates are based.  

* Employee:  It is possible that as taxpayers, some employees’ returns may meet the criteria for inclusion in SOI samples.

* Audit Trail Information:  The SAs/DBAs periodically check the activity log file to ensure normal system functioning and to watch for warning messages that might indicate a problem.  The activity log contains:

* Logon and logoff of all users by USERID
* Password change including USERID
* Date and time of event
* Success or failure of the event
* All actions by System and Security Administrators
* All actions by Database Administrators (DBAs)

* Other:  Only samples of taxpayer data are input to the system.
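
The activity log fields listed above could be represented as a simple record type. The following is a minimal illustrative sketch only; the names AuditEvent, EventType, and failed_events are hypothetical and are not taken from SOI-DPS, whose actual log format is not described in this document.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class EventType(Enum):
    """Event categories mirroring the activity log contents (hypothetical names)."""
    LOGON = "logon"
    LOGOFF = "logoff"
    PASSWORD_CHANGE = "password_change"
    ADMIN_ACTION = "admin_action"  # action by a System or Security Administrator
    DBA_ACTION = "dba_action"      # action by a Database Administrator


@dataclass(frozen=True)
class AuditEvent:
    """One activity log entry: who, what, when, and whether it succeeded."""
    user_id: str
    event_type: EventType
    timestamp: datetime
    success: bool
    detail: str = ""


def failed_events(events):
    """Return the events that did not succeed, oldest first."""
    return sorted((e for e in events if not e.success), key=lambda e: e.timestamp)
```

A periodic log check of the kind described above might begin by reviewing the output of failed_events for repeated logon failures under the same USERID.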

2. Describe/identify which data elements are obtained from files, databases, individuals, or any other sources.

A. IRS
B. Taxpayer
C. Employee
D. Other Federal Agencies (List agency)
E. State and Local Agencies (List agency)
F. Other third party sources (Describe)

* IRS:  Returns selected for SOI are identified at the Martinsburg Computing Center (MCC). 

* Taxpayer:  Data are extracted from the Individual and Business Master File systems for sampled returns representing virtually every type of tax and information return filed with the Service.  Paper returns or images of scanned returns may be used in this process. The data elements for any given SOI study are defined in the processing documents maintained by the SOI program branches.

* Employee:  It is possible that as taxpayers, some employees’ returns may meet the criteria for inclusion in SOI samples.

* Federal Agencies: None.  SOI-DPS does not receive data from Federal Agencies.

* State and Local Agencies:  None.  SOI-DPS does not receive data from State and Local Agencies.

* Other third party sources:  SOI utilizes external data sources such as Mergent Online and Best Insurance reports in perfecting data for statistical purposes.

3.  Is each data item required for the business purpose of the system?  Explain.

Yes.  SOI-DPS is designed to facilitate the collection, production and publication of statistical data for use by customers in the formulation and measurement of legislation relating to taxation as required under IRC 6108.  These statistical data are based on tax and information returns, most of which are filed annually.

4. How will each data item be verified for accuracy, timeliness, and completeness?

* Accuracy:  SOI employees use various mathematical and validation methods to check selected data for accuracy.  As part of this process, actual returns or images of scanned returns may be used to verify the accuracy of the data that are input.

* Timeliness:  Data are based on current year filings.  Most SOI studies are annual. SOI-DPS data retained in the system are static once the study files are completed. 

* Completeness:  The data for selected returns are input into the XXXX XXXX resident on the SOI systems in the Cincinnati and Ogden Submission Processing Centers.  Additional data are extracted from the physical tax return or an image of a scanned return to prepare a complete record for SOI.  After subjecting the data to various consistency and validity checks, records are input for further statistical processing on SOI’s systems in Washington, DC in order to produce data for transmittal to SOI’s customers.

5. Is there another source for the data?  Explain how that source is or is not used.

No.  SOI-DPS data are only collected from selected tax and information return forms in accordance with legislative mandates.  Data are extracted from the Individual and Business Master File systems for sampled returns. 

As various modernization projects are implemented, it is anticipated that SOI-DPS will receive data from different systems, e.g., the images and associated data from written IRS Forms K-1 that are scanned in the Service Center Recognition/Image Processing System (SCRIPS), and the XML data from the Modernized Tax Return Database (MTRDB).

Except in very specialized circumstances, there is no interface between the IRS and the taxpayer to obtain any additional information.

6. Generally, how will data be retrieved by the user? 

Tax and information returns are pulled and controlling information is entered into the SOI tracking database from all nine IRS Submission Processing Centers (SPCs):

* Andover, MA
* Atlanta, GA
* Austin, TX
* Cincinnati, OH (Covington, KY)
* Fresno, CA
* Kansas City, MO
* Memphis, TN
* Ogden, UT
* Philadelphia, PA

Editing is performed at five of these centers (Atlanta, Ogden, Kansas City, Cincinnati, and Austin), and computer processing is done in two – Cincinnati (CSPC) and Ogden (OSPC).  Samples are selected at the Martinsburg Computing Center (MCC) in West Virginia. Post-production processing is accomplished at the SOI National Office (NO) in Washington, DC. The staff assigned to the NO manages the SOI-DPS environment.

Paragraph Redacted.

The Treasury Communications System/Data Communications Utility (TCS/DCU) is the communications network that connects the SOI-DPS sites across the country. Users at the SPCs, as well as those located in Washington, DC at National Office, Treasury’s Office of Tax Analysis, and the Joint Committee on Taxation, access the system using the DCU component of the TCS/DCU, commonly referred to as the IRS Wide Area Network (WAN).

Access to SOI-DPS from outside the IRS WAN is tightly regulated and is controlled through the Service's Secure Dial-In (SDI) security facility.  There is no dial-in connectivity to SOI-DPS except through SDI.

As the host for the SOI Exempt Organization Image Net (SEIN), the SOI Document Image Network (DIN), and the Large & Mid Size Business (LMSB) Image Network (LIN), SOI-DPS SAs grant authorized IRS users access to scanned images of tax returns.

In addition, users with proper permissions will use the intranet to connect to the end-user portal to view XML images of electronically filed tax returns.

Finally, data from SOI study files are made available to the public via the “Tax Stats” Internet site.  All the data posted to this site have been summarized or otherwise processed in order to ensure the confidentiality of specific taxpayers.

7. Is the data retrievable by a personal identifier such as name, SSN, or other unique identifier? 

Yes.  Individuals with the proper permissions can access the data and images of returns using Social Security Numbers, Employer Identification Numbers, Taxpayer Identification Numbers, Industry Codes and any other information contained in the database for that return.

Access to the Data

8. Who will have access to the data in the system (Users, Managers, System Administrators, Developers, Others)?

The managers of the various projects control access to the data for a particular SOI study. 

The primary users of SOI-DPS are editors in the Submission Processing Centers (those extracting the data from the various tax returns), National Office analysts (economists and math/stats) working on a particular study, and system developers and testers.  The SAs and the DBAs have access in varying degrees, according to their job responsibilities. 

In addition, personnel from Treasury’s Office of Tax Analysis (OTA) have access to all of the microdata for the various SOI studies, and certain data are accessible by staff at the Congressional Joint Committee on Taxation (JCT) and the Bureau of Economic Analysis (BEA) at the Commerce Department.

External, private-sector customers such as the Urban Institute, PRI (Philanthropic Research Institute), and the Foundation Centers are also granted access to images of certain publicly available tax returns.  

9. How is access to the data by a user determined and by whom? 

At the beginning of each study year, lists of users are prepared and permissions are set based on the request of the owners of the data in the SOI program branches.  Permissions are documented on the SOI-DPS User Registration/Change Request Form that is used to grant system access.  A user’s access to the data terminates when it is no longer required. 

Users have access only to the data they need to perform their job duties.  For editors in the field, access is through a menu restricting them to certain studies and certain permissions on those studies.  For example, if they are involved in quality review, their access will be more extensive than if they are only doing original edit work.

The managers of the various projects control access to the data for a particular SOI study.  The data are organized on the system according to Branch designation and, unless special arrangements are made, individuals are not given even “read” access to data that they do not own.

These same managers, via the IRS Online 5081 system, also control access to scanned images of tax returns stored in the SEIN, DIN, and LIN components of SOI-DPS.
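
The access rules described in this answer amount to a default-deny check: a user needs an explicit grant for a study, and data outside the user's own Branch are off limits unless a special arrangement exists. The following is a minimal illustrative sketch under those assumptions; the names User, Study, can_access, and special_arrangements are hypothetical and do not describe the actual SOI-DPS implementation.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: str
    branch: str
    # Maps a study name to the set of permissions granted for that study,
    # e.g. {"edit"} for original edit work or {"edit", "review"} for quality review.
    study_permissions: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Study:
    name: str
    owning_branch: str


def can_access(user, study, permission, special_arrangements=frozenset()):
    """Default-deny check for one permission on one study.

    special_arrangements is a set of (user_id, study_name) pairs standing in
    for the "special arrangements" that allow cross-Branch access.
    """
    granted = user.study_permissions.get(study.name, set())
    if permission not in granted:
        return False  # no explicit grant for this study and permission
    if user.branch == study.owning_branch:
        return True   # the data belong to the user's own Branch
    return (user.user_id, study.name) in special_arrangements
```

In this sketch, an editor granted only "edit" on a study is refused "review" on the same study, which mirrors the narrower access given to editors doing original edit work.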

10. Do other IRS systems provide, receive, or share data in the system?  If YES, list the system(s) and describe which data is shared.  If NO, continue to Question 12.

Yes.  SOI-DPS supports SOI complex online transaction processing, electronic data transfers between sites, enhanced file organization and document control, error notification, data perfection, and management information reporting.

At the National Office site, in addition to specialized statistical software used for analytical purposes, the LAN also supports various management information and office automation software packages.  This is required to support project planning, documentation, the SOI tape library, and related peripheral equipment necessary to provide appropriate interfaces.

Additional functionality of SOI-DPS provides access to the data by SOI customers, particularly Treasury's Office of Tax Analysis and the Congressional Joint Committee on Taxation.

At the Cincinnati Submission Processing Center, SOI-DPS also houses the hardware and operating system software for EDS (Exempt Organization Employee Plans Determination System).  Access to the server is controlled by SOI-DPS SAs in Cincinnati.  Database administration, security certification of the EDS Informix application, and the interface to the Tax Exempt Determination System (TEDS) application are the responsibility of MITS staff at that location.

In order to facilitate access to LIN imaged returns by LMSB staff, SOI provides a weekly extract of document tracking information that is uploaded to the LMSB Workload Information System (LWIS).  This system alerts LMSB staff when images of returns are available for viewing.

SOI also provides return document tracking information to OSPC staff for the purposes of updating the Audit Information Management System (AIMS) / Examination Returns Control System (ERCS) database.

11. Have the IRS systems described in Item 10 received an approved Security Certification and Privacy Impact Assessment?

Yes.  The current Security Certification is valid through October 24, 2004.  The effective PIA is dated June 13, 2000, and the PIA Concurrence memo is dated December 18, 2000.

12.  Will other agencies provide, receive, or share data in any form with this system?

Yes.  Personnel from Treasury’s Office of Tax Analysis (OTA) have direct access to all of the microdata for the various SOI studies.  Also, certain data are accessible by staff at the Congressional Joint Committee on Taxation (JCT) and the Bureau of Economic Analysis (BEA) at the Commerce Department in accordance with legislative mandates.

SOI-DPS does not receive data from Federal, State, or local Agencies.

Administrative Controls of Data

13.  What are the procedures for eliminating the data at the end of the retention period?

Data in SOI-DPS are required for studies and publication of statistical data for legislation relating to taxation.  Most data are kept indefinitely. 

Because most data are kept indefinitely, procedures for eliminating the data are rarely necessary.  However, when data are no longer needed, the procedures for eliminating data from SOI-DPS follow OMB Circular A-130 requirements as described in General Records Schedule 20 (IRM 1.15.3, Chapter 20); The Records Management Program, IRM 1.15.1; and IRM 1.15.2, Types of Records and their Life Cycles.

14.  Will this system use technology in a new way?  If "YES" describe.  If "NO" go to Question 15. 

Yes.  The SEIN, DIN and LIN components of SOI-DPS (having evolved from microfilm to digital images) allow editors to view scanned images of paper returns so they can view both the data input forms and the returns themselves, side by side, on large computer monitors.  This has improved editing accuracy and reduced the cycle time required for SOI to process these returns.

As various modernization projects are implemented, it is anticipated that SOI-DPS will receive data from different systems, e.g., the images and associated data from Forms K-1 that are scanned in the SCRIPS system, and the XML data from MTRDB.  These data may then be transformed into an on-screen “return” populated with electronically filed data.

SOI has just begun the process of building an XXXX XXXX, the purpose of which is to store all readable SOI data and to make those data available to authorized SOI-DPS customers, both internal and external.

15.  Will this system be used to identify or locate individuals or groups?  If so, describe the business purpose for this capability.

No.  The system does contain data that could be used to identify or locate individuals or groups of people.

However, this system of records is used solely for statistical purposes and not for compliance or any other “monitoring” activity.  Any data released to the public are formatted and designed to prevent the identification or location of any particular taxpayer or group of taxpayers.

16. Will this system provide the capability to monitor individuals or groups? If yes, describe the business purpose for this capability and the controls established to prevent unauthorized monitoring.

Yes.  There are data on the system that can provide the capability to monitor individuals and groups.  For example, certain SOI panel studies track selected taxpayers over time.

However, this system of records is used solely for statistical purposes and not for compliance or any other “monitoring” activity.  Any data released to the public are formatted and designed to prevent the monitoring of any particular taxpayer or group of taxpayers.

17. Can use of the system allow IRS to treat taxpayers, employees, or others, differently?  Explain.

No.  There is no possibility of disparate treatment of individuals or groups because the statistics, studies, and compilations are designed so as to prevent disclosure of any particular taxpayer’s identity.

18.  Does the system ensure "due process" by allowing affected parties to respond to any negative determination, prior to final action?

Not Applicable.  The system ensures equitable treatment because the data collected are for statistical rather than tax administration purposes.  Any data released to the public are formatted and designed to prevent the identification of any particular taxpayer.

19.  If the system is web-based, does it use persistent cookies or other tracking devices to identify web visitors?

Not Applicable.  SOI-DPS is not a public web-based system.  Although some application components are web-based within SOI-DPS, none of these components allows access by any user who has not already been granted such access via either IRS Form 5081 or the SOI-DPS User Registration/Change Request Form.  There is no need to track anything but pre-authorized users, as there are no “web visitors.” 

Page Last Reviewed or Updated: November 19, 2004