Customer Account Data Engine 2.2

 

Privacy Impact Assessment- Customer Account Data Engine (CADE)
System Overview:

The CADE system is a major application (MA) that combines commercial off-the-shelf (COTS) products with custom-developed software provided to the IRS under contract with the PRIME. The general purpose of CADE is to accept, validate, and store taxpayer and tax return information. CADE is a key component of the modernized application environment as an enabler of greatly improved processing. The Business Unit owner of CADE is the Wage and Investment (W&I) division of IRS.
Over several releases, CADE will create accurate, current, authoritative data stores as the Taxpayer Account Database (TADB) and the Tax Return Database (TRDB) in the Modernization Blueprint (2000). CADE will also construct related tax administration systems processes, incrementally replacing the IRS master files.

Requested Operational Date:  November 21, 2006 (for Milestone 4a Exit)

Systems of Records Notice (SORN):
CADE is currently covered by Treasury/IRS 24.030, CADE Individual Master File (IMF)
Treasury/IRS 24.046 Business Master File (BMF)
Treasury/IRS 34.047 IRS Audit Trail & Security Records

Introduction

The following PIA is provided for the Customer Account Data Engine (CADE) Release 2.2 Milestone 4A.  CADE is an Internal Revenue Service (IRS) application/system that has been categorized as a major application. CADE will provide IRS employees with accurate authoritative data for post-filing applications, such as compliance, notices, and revenue accounting.
1.1 Release 2.2 Changes

Release 2.2 maintains the basic functionality of CADE but includes these changes:

* Non-Return Based Address Change
* SCREF Interface  (From DCUT through deployment)
* Married Filing Jointly & Separately without dependents (Married Once) (From DCUT through deployment)
* “Clean” Dependents (From DCUT through deployment)
* Head-of-Household without Dependents or with “Clean” Dependents
* Annual Archiving
* Incremental Update of LAFF from LAFFOL  (From DCUT through deployment)
* Limited Name Change on Return
* 1040 Schedules A, B, and R
* 1040A Schedules 1 and 3
* Filing Season Changes
* Outbound Social Security Administration (SSA) Interface
* Census Bureau Interface
* Make FFMSR certification Current
* EA version 2.5 Compliance
* 1040 Schedules C, E, F w/o EIN and their supporting forms, including Sch. SE
* 1040 Schedule D and its supporting forms
* SET Req #73: Reuse CPE 460-02 load module to populate LRFF (Option 1)
* Updated ISA with FMS

Perhaps the most specific change is the addition of interfaces to SSA and Census. These are described in greater detail in the PIA. These are not new data exchanges for the IRS; previously, they were accomplished through IMF. The change is that these exchanges may now occur directly from CADE to the external agency. (Note: IMF data is currently sent to Census via tapes. The CADE Project Office is exploring electronic transmission of IMF and CADE data to Census.) In the case of providing data to Census, CADE plans to merge 2 files with existing IMF files that are currently provided to Census.

1. Data in the System

1.1 Generally describe the information to be used in the system in each of the following categories: Taxpayer, Employee, and Other.

Taxpayer–CADE maintains information about taxpayers for the purpose of administering the tax code. Information maintained on individuals is generally derived from information submitted by the taxpayer on tax forms 1040, 1040A, and 1040EZ, including those with schedules for interest and ordinary dividends (Schedule 1) and disabled credits (Schedule 3). Taxpayers filing these forms as single, married filing jointly, married filing separately, or head of household are in scope for processing in the current release of CADE. For detailed information regarding the data elements in CADE, please consult the data dictionary found in Appendix A of this document.

In addition to the taxpayer data in the production environment, the CADE System Acceptability Test (SAT) effort uses copies of taxpayer data to complete testing activities in the development environment. Copies of production data are obtained two ways.

* Once a year, all Individual Master File weekly and monthly files for a specific cycle are copied. A copy of the CADE database for that same cycle is also acquired. A listing of the specific files is provided to the staff at XXXXXXX, who copy the data files to high level qualifiers used only by the SAT team. A copy of the listing used to complete the most recent copy is attached.

* During the filing season, files containing return records from submission processing centers are copied by the test execution team. Those files are copied to the same high level qualifier used for the annual data copy activity.

Employee–CADE also maintains employee information in the form of audit logs that record requests from CADE’s system administrators; however, no other personal information is used by CADE. Four classes of Audit Trails are available in the CADE system, each of which has its own categories of auditable events. The relevant audit trails for CADE include the following areas:

* RACF Security Monitoring (at the MITS GSS-21 level)
* RACF for DB2 Security Monitoring (CADE related DB2 events captured by RACF)
* DB2 Transaction logging (CADE DB2 subsystem events monitored by DB2)
* CADE Subsystem logging (Application logging for CADE subsystem events).

The CADE DB2 database currently retains information about access attempts denied due to inadequate authorization, explicit GRANT and REVOKE, assignment or change of authorization ID.  In addition, the audit logs track the time and date of the event, and the user ID associated with events which fail.  Refer to the XXXXXXXXXXXXXXX for details surrounding the audit controls for the CADE portions that are accessed through CFOL and the XXXXXXXXXXXXXXX for the CADE portions that are accessed through RACF.

RACF Security Monitoring Events
The RACF security auditing features are enabled to ensure that audit trails are produced by the system.  The audit trail allows identification of auditable events and for the management of audit trails (logs) in a secure environment.  The RACF Law Enforcement Manual (LEM) audit specification includes:

* Logon UserID
* Logon Terminal ID
* Password Change Including UserID
* Password Change including Terminal ID
* File Create including file name
* File Delete including file name
* File Open including file name
* Time/Date Stamping

RACF for DB2 Security Monitoring Events
The CADE XXXXXXXXXXX is audited for the following events:
* Access attempts denied because of inadequate authorization
* XXXXXXXXXXXXXXXXXXXXXXXXX assignment or change of authorization ID.

In addition, the audit logs track the time and date of the event and the userID associated with events that fail. To turn on auditing for the CADE objects, the CADE tables are defined with the AUDIT ALL attribute. This attribute specifies that auditing is performed for the first operation of any kind performed by each unit of work of a utility or application process. The impact of this attribute on CADE performance is minimal, as it is only in effect for those authorization IDs for which the DB2 Audit Trace XXXXXXXXXXXXXXXXXXXXXXXX are activated.
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.

For CFOL users accessing CADE through an IDRS query, a summary description of the SACS interface is given here. Audit tools create, maintain, and protect a trail of actions produced by users and administrators that trace security-relevant events to an individual, ensuring accountability. Audit trails are produced by the Security & Communications System (SACS) portions of the system through which all users must pass to get access to the TIF database used by IDRS. All end user activity is gathered and stored by the system log. Audit data consist of a 10-digit assigned unique employee number, a 10-digit case number, and the tax period, for example, 0602 (indicating a return filed for the tax year ending June of 2002).
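
The audit data layout described above can be illustrated with a small validator. This is a minimal sketch, not the SACS implementation: the class and function names are hypothetical, and the tax period is assumed to be encoded as MMYY, following the 0602 example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical container for the three audit fields named in the text."""
    employee_number: str  # 10-digit assigned unique employee number
    case_number: str      # 10-digit case number
    tax_period: str       # 4 digits, assumed MMYY (e.g. "0602")


def validate(record: AuditRecord) -> None:
    """Raise ValueError if any field does not match its assumed layout."""
    if not re.fullmatch(r"\d{10}", record.employee_number):
        raise ValueError("employee number must be 10 digits")
    if not re.fullmatch(r"\d{10}", record.case_number):
        raise ValueError("case number must be 10 digits")
    if not re.fullmatch(r"(0[1-9]|1[0-2])\d{2}", record.tax_period):
        raise ValueError("tax period must be MMYY with month 01-12")


def decode_tax_period(tax_period: str) -> tuple[int, int]:
    """Split an MMYY tax period into (month, two-digit year)."""
    return int(tax_period[:2]), int(tax_period[2:])
```

The two-digit year is returned as-is because the text does not say how the century is resolved.
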
Other–There are no other individuals in the system.

1.2. What are the sources of the information in the system?

CADE has several interfaces that are the source of personal information in the system. Four of these interfaces are with the Individual Master File (IMF). The IMF interfaces that are the source of personal information include:

1. IMF to CADE Initialization– CADE is initialized with taxpayer accounts, containing five years or less of tax modules.
2. 701 Exec Return Transaction File (RTF) – Initialization: Taxpayer returns that have been validated and accepted by the IMF. This transfer of data from the CPE to CADE occurs through shared DASD or tape files controlled by RACF and NCONTROL. This activity occurs during the initialization and as part of the normal processing.
3. Communicate to CPE to IMF Annual Conversion–Occasionally during the mid-year, IMF must make changes to the layouts of Entity and/or Tax Modules. The same layout changes must be made to the Entity and Tax Modules in CADE. Therefore, after completing the final CADE processing cycle of each year (Cycle 52/53), the data from LAFFOL will be extracted, and the LAFF portion of each record will be made available as data files for CPE. This CADE data file will be reformatted into the new layout as part of the annual IMF Conversion Run (INY 440-01) and returned to CADE via this interface.
4. CADE Router/Filter–Receives one or more incoming IMF transactions from the service centers, non-service centers, and external trading partners, including FMS acknowledgement. Each one of these transactions is processed to determine whether it should be further processed by CADE or forwarded to IMF.

In addition to the taxpayer records from IMF, CADE has the following sources from the IRS, Treasury and SSA, respectively:

5. Direct Deposit Limitation File–The IRS maintains a file (460-07-10) of bank accounts utilized for direct deposit of Form 1040 tax refunds. The file includes the Routing and Transit Number (RTN) of the financial institution, the specific Bank Account Number (BAN) being accessed, and the number of refunds deposited to the account during the current processing year (PY). The file also contains limited tax return information for each of those refunds. Specifically, the following data elements are included:

a. ROUTING AND TRANSIT NO.
b. BANK ACCOUNT NUMBER
c. UNPOSTABLE CODE
d. MARRIED FILING SEPARATE
e. RECORD COUNT
f. VALIDITY DIGIT
g. SSN
h. DLN
i. MFT
j. TAX PERIOD
k. FILING STATUS
l. CYCLE
m. ZIP CODE
n. REFUND AMOUNT
6. Financial Management Services (FMS)–FMS, a Treasury system, will return an acknowledgement to CADE when a refund is paid or an offset occurs. This acknowledgement has the following data elements:

a. RUN DATE
b. ALC
c. RUN TIME
d. SC CAMPUS
e. AGENCY FILE DSN
f. SCHEDULE NUMBER
g. CONTROL NUMBER
h. FILE ID: IRSIND
i. IM FILE DSN
j. FILE TYPE
k. CYCLE IN FILE:          
l. ERROR NOTICE PAYMENTS
m. REGULAR PAYMENTS
n. GRAND TOTAL

7. Treasury Offset Program (TOP) Debt File to CADE–Each week, FMS provides the IRS with a file containing the Taxpayer Identification Number (TIN) of all the entities currently identified as having outstanding Federal debts covered under TOP. The IRS validates and sorts this file and then uses the data (File DMF-20-11) to mark refunds (Posted and RFIF) as being subject to reduction or elimination based on a TOP offset. The IRS also uses this file to trigger the conversion of a taxpayer’s requested Credit Elect of an overpayment (to a subsequent obligation) to a refund, up to the amount of the TOP debt. The specific data elements include:

a. TAXPAYER IDENTIFICATION NUMBER (TIN)
b. TAX YEAR
c. DEBT AMOUNT

8. Data Master File (DM-1) – Initialization–DM-1 is a file that originates from the SSA with valid name controls and social security numbers. The DM-1 is used to supply the date of birth and citizenship indicator values for taxpayers. Specifically, IRS uses this file to validate social security numbers in tax returns because only the Social Security Administration has the authority to add or change social security accounts. For example, if an individual marries and changes her last name, this can only be verified by the DM-1 file from the Social Security Administration. The citizenship indicator is used to determine eligibility for programs such as Earned Income Credit (EIC). The specific data elements in this file are as follows:
a. TIN
b. TIN TYPE CODE
c. UPDATE DATE (YYYYMMDD)
d. DATE OF BIRTH (YYYYMMDD OR '00000000')
e. DATE OF DEATH (YYYYMMDD OR '00000000')
f. GENDER
g. CITIZENSHIP
h. DM1 GROUP COUNT
i. VALIDATION SOURCE CODE
j. NEW THIS QUARTER INDICATOR
k. NAME CONTROL
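
The DM-1 date fields listed above are given as YYYYMMDD or '00000000'. As a minimal sketch of how such a field might be read, assuming '00000000' means that no date is recorded (the function name is hypothetical):

```python
from datetime import date
from typing import Optional


def parse_dm1_date(field: str) -> Optional[date]:
    """Parse a YYYYMMDD field; '00000000' is treated as 'no date recorded'."""
    if len(field) != 8 or not (field.isascii() and field.isdigit()):
        raise ValueError(f"expected 8 digits, got {field!r}")
    if field == "00000000":
        return None
    # date() raises ValueError for impossible dates such as 20010230.
    return date(int(field[:4]), int(field[4:6]), int(field[6:8]))
```

Returning `None` for the all-zero value keeps "unknown" distinct from any real calendar date, so a missing date of death cannot be mistaken for a valid one.
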

1.2.a. What IRS files and databases are used?

When initialized, CADE will contain individual taxpayer-related information presently contained in the following legacy files and databases:

* The Individual Master File (IMF)
* The Return Transaction File (RTF)
* Data Master File (DM-1).

In addition to the yearly extract, CADE receives data from the following:

* Individual Master File (IMF)
* Generalized Mainline Facilities (GMF)
* Financial Management Services (FMS).

CADE data files will be created and sent to CPE for processing. CADE will send data to the following systems:

* Refund Information File (RFIF)
* Questionable Refund Program/Refund Interest Program/Electronic Tax Administration (QRP/RIP/ETA)
* Duplicate Direct Deposit (DDD)
* Statistics of Income (SOI)
* 701 Exec, Microfilm Replacement System (MRS)
* IMF Weekly Reports
* Return to CPE and IMF
* Taxpayer Address Request (TAR) – Legacy Account Formatted File (LAFF) Summary
* Corporate Files Online (CFOL)
* Interim Revenue Accounting Control System (IRACS) Recap Data
* Interim Revenue Accounting Control System (IRACS) Refund Data
* Financial Management Information Systems (FMIS)
* Reciprocal Accounting Control Record (RACR)
* XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Processing Validation Section Recap Information (XXXXXXX_PVS Recap)
* XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Processing Validation Section Refund Information (XXXXXXX_PVS Refund)
* Individual Master Files (IMF)
* Electronic Certification System (ECS)
* Microfilm Replacement System (MRS)
* Individual Return Transaction File Online (IRTFOL)
* Return Transaction File (RTF)
* Refund Timeliness Program (RTP)
* Enterprise System Management (ESM)
* Taxpayer Account Transcripts
* Send to Current Processing Environment (CPE)
* LARS-format Balance and Control Data
* Balancing Reports
* Security Administration System (SAS) Reports
* Service Center Input Processing Automation System (SCIPAS) Reports
* Accountability Acceptance Vouchers (AAV) Reports
* Obligation Balance Validation Reports
* Weekly Obligation Balance Data
* CADE Initialization to IMF.
* IMF Annual Conversion
* CPE for Address Change – this information goes to the following CPE systems:
o Enhanced Entity Index File (EEIF)
o Key Index File (KIF)
o Name Search File (NSF)
o National Account Profile (NAP)
o Address Error Report
* CADE R2CPE Reports
* CPE for Discriminant Index Function (DIF) Processing
* Social Security Administration Self Employment Data
* US Census Bureau Economic Data (Census)

1.2.b What Federal Agencies are providing data for use in the system?

CADE receives the DM-1 file from SSA and the TOP Debt File and FMS acknowledgement files from Treasury as described above. The DM-1 and TOP Debt File provide current and authoritative information. The FMS acknowledgement confirms that the refund or offset requested by CADE was completed.

1.2.c. What State and Local Agencies are providing data for use in the system?

No state or local agencies provide data to CADE.

1.2.d. From what other third party sources will data be collected?

There are no other third party sources.

1.3.a. How will the data collected from sources other than IRS records and the taxpayers be verified for accuracy?

CADE has strong data integrity, validation, and balance and control processes in place to ensure the information processed is accurate. CADE’s data originates from taxpayer-submitted information (the 1040) and from authoritative sources, including IMF and SSA, to ensure taxpayer information is accurate. Additionally, accuracy, completeness and validity controls for CADE are designed into the subsystem modules for Router Filter, Transaction Processing (TP), Daily Processing, Restore to CPE, Balance and Control, and CPE Interface.
TP maintains taxpayer account and return data in the CADE system. It processes messages received from the Accept IMF Transaction and Format for CADE (IMF2CADE) CI. TP identifies conditions that require the restore of the taxpayer account(s) from the CADE system back to the current processing environment (R2CPE) CI.

Send It To CPE, also known as Send It Back (SIB), situations can occur against three types (status) of taxpayers as they relate to CADE. The first type is taxpayers that are not in CADE. The second group is those taxpayers that have been initialized/migrated into CADE but have not yet experienced processing by CADE. The final type is those taxpayers that have been initialized/migrated into CADE, and have had the tax return (TC150) posted to the modernized database. Against these taxpayer types, various stimuli can trigger a restore to CPE request. These include invalid/out-of-scope transactions (or any transaction if the taxpayer is not in CADE), a tax return (SCRS TC150) that is either invalid or fails business rules, and failure to pass FINALIST validation. CADE’s responses to these R2CPE triggers include sending the triggering transaction to CPE (occurs on a daily basis, =”D”), and restoring taxpayer information to CPE (occurs on a weekly basis, =”W”). Either or both can occur depending on the taxpayer type/stimulus combination.
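
The trigger conditions in the preceding paragraph can be sketched as a single decision function. This is an illustration under stated assumptions, not CADE's logic: all names are hypothetical, and the boolean flags stand in for the real validation steps. The sketch deliberately does not choose between the daily ("D") and weekly ("W") responses, because the text does not give the mapping for each taxpayer type/stimulus combination.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TaxpayerStatus(Enum):
    NOT_IN_CADE = "not in CADE"
    INITIALIZED = "initialized/migrated, not yet processed by CADE"
    RETURN_POSTED = "initialized/migrated, TC150 posted"


@dataclass(frozen=True)
class Transaction:
    """Hypothetical transaction summary; flags stand in for real checks."""
    code: str                          # e.g. "150" for a tax return
    in_scope: bool                     # valid and in scope for this release
    passes_business_rules: bool = True
    passes_finalist: bool = True


def r2cpe_trigger(status: TaxpayerStatus, txn: Transaction) -> Optional[str]:
    """Return the reason a restore-to-CPE request is triggered, or None."""
    if status is TaxpayerStatus.NOT_IN_CADE:
        return "taxpayer not in CADE"  # any transaction triggers here
    if not txn.in_scope:
        return "invalid or out-of-scope transaction"
    if txn.code == "150" and not txn.passes_business_rules:
        return "tax return fails business rules"
    if not txn.passes_finalist:
        return "FINALIST validation failed"
    return None
```
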
FINALIST validation refers to both Batch and Online checking of searches, coding and standardization of address data input to CADE. Both versions of FINALIST verify address information being loaded and format address information into the appropriate data structures. CADE employs two flavors of Finalist: (1) a batch flavor invoked by batch program calls or JCL procedures and (2) an online flavor invoked by CICS program links. CADE’s Initialization processes use the batch version of Finalist to verify address information being loaded from IMF into CADE for accuracy and also to format address information into the appropriate data structures. CADE’s online CICS transaction-processing components use Finalist to perform the same functions as batch, but for address information received during CADE online transaction processing.

If errors occur, the CADE Operator’s Guide, Section 5, Alert Messages and Error Exceptions, details the following procedures:

* Format for CADE Application Error Messages
* Listing of CADE Application Alerts
* Format for CADE Error Exceptions (DB2/CICS)
* Error Queue Processing

Refer also to the CADE Operator’s Guide, Appendix D, Correcting Error Messages in CADE Error Queue (excerpted below).

There is one CADE Error Queue on each production LPAR. When CADE cannot process a message, it will post the original message to the CADE Error Queue, along with an error code (defined in Table 4-3 below).  After all the messages posted to the CADE Error Queues are automatically recycled by job, CAD86A2/A3, any remaining messages will have to be treated by the error correction process. The operator will open an ITAMS ticket and contact CADE Product Support.
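
The recycle-then-correct flow described above can be sketched as follows. This is a simplified illustration only: the internals of job CAD86A2/A3 are not described in this document, so `process` is a hypothetical stand-in for CADE's message processing, and the queue is modeled as a plain list of (message, error code) pairs.

```python
from typing import Callable, List, Tuple

# (original message, error code) as posted to the error queue
QueuedError = Tuple[str, str]


def recycle(queue: List[QueuedError],
            process: Callable[[str], bool]) -> List[QueuedError]:
    """Retry every queued message once; return the ones that still fail."""
    remaining = []
    for message, error_code in queue:
        if not process(message):
            # Still unprocessable: left for the manual error correction process.
            remaining.append((message, error_code))
    return remaining
```

Whatever `recycle` returns corresponds to the messages for which the operator would open a ticket and contact CADE Product Support.
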

1.3.b. How will data be checked for completeness?

CADE’s data originates from authoritative sources. Files from the original sources are checked as they enter CADE and are rejected if the information is not complete. 1.3.a provides additional information on completeness.
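
As a minimal sketch of a completeness check of this kind, the example below rejects a file if any record lacks a required field. The field names are taken from the TOP Debt File element list in section 1.2; representing a record as a dictionary, and rejecting the whole file on a single incomplete record, are assumptions of this sketch.

```python
from typing import Dict, Iterable, List

# Field names from the TOP Debt File element list; the dict layout is assumed.
TOP_DEBT_FIELDS = ("TIN", "TAX YEAR", "DEBT AMOUNT")


def missing_fields(record: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required fields that are absent or blank in the record."""
    return [name for name in required if not record.get(name, "").strip()]


def file_is_complete(records: Iterable[Dict[str, str]],
                     required: Iterable[str] = TOP_DEBT_FIELDS) -> bool:
    """Accept a file only if every record has every required field."""
    required = tuple(required)
    return all(not missing_fields(r, required) for r in records)
```
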

1.3.c. Is the data current?  How do you know?

Prior to the start of each tax season, CADE undergoes a process called initialization. In initialization, the CADE database is erased and the current year’s tax processing begins. Data extracted from the Current Processing Environment (CPE) is converted to a modernized format and used to populate CADE’s modernized database, establishing a repository of taxpayer and account data for the segment of taxpayers supported by the release. Annually, CADE receives the DM-1 file from the SSA, which provides updated, current date of birth information and is considered an authoritative source.

1.4. Are the data elements described in detail and documented?  If yes, what is the name of the document?

A Release 2.2 data dictionary provides the data elements in CADE and can be found in Appendix A. Release 2.2 Interface Control Documents provide the specific data elements that make up each file transferred within a given interface.

2. Access to the Data

2.1. Who will have access to the data in the system (Users, Managers, System Administrators, Developers, Others)?

The following employees (IRS and Contractors) who serve as privileged users have access to CADE data:

* Database Administrators – Database Administrators design and administer the data warehousing, platform application dependencies, performance, maintenance and versioning.
* RACF Administrators – RACF Security Administrators will provide assistance in the installation, maintenance, optimization, integration, backup, and recovery of RACF. The RACF Security Administrator will also execute and support security policies and procedures.
* System Operators – System operators are responsible for day-to-day operation of CADE and associated applications, utilities, and management of both incremental and full backups of the system.
* System Programmers – The principal focus of this position is the installation and maintenance of application and system software, troubleshooting, capacity planning, and performance monitoring.
* System Developers – The CADE SAT effort uses copies of taxpayer data to complete testing activities.
* Contract Employees – The system programmers and developers are frequently contract employees.

Taxpayers: none

* Other: The XXXXXXX Computing Center (XXXXXX) serves as the security auditors for XXXXXXXXXXX Computing Center (XXXXXX). XXXXXX monitors and reviews the activities of CADE DBAs, RACF Administrators, and Systems Programmers within the CADE data sharing group and reports to XXXXXX Security any violations or questionable activities. There remain legacy risks around auditing for CADE.

2.2. How is access to the data by a user determined?  Are criteria, procedures, controls, and responsibilities regarding access documented?

The OL5081 is used to document access requests, modifications, and terminations for all types of users, including system administrators, and test accounts. When a new user needs access to IRS systems or applications, the user’s manager, or designated official, completes an OL5081 requesting access for the new user. OL5081 is an online form, which includes information such as the name of the system or application, type of access, and the manager’s signature approving authorization of access. The completed OL5081 is submitted to the security or user administrator, who assigns a user ID and an initial password. Before access is granted, the user is required to digitally sign OL5081, acknowledging his/her security responsibilities when using the system.

The following procedures are fully documented XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.
* Online 5081 Form is also used for requesting user access to DB2 and TSO Login userIDs on the XXXXXX mainframe. Access to XXX is granted through manager approval and only granted to perform job duties. Once management has approved the request based on a need-to-know, the DBAs add user access.
* XXXXXX Form 104, RACF IBM Compatible Masterfile Mainframe Systems Data Set Profile/General Resource Profile Access Request Form, is used to request specific access to Production Data Set Files or RACF General Resource profiles.
* XXXXXX Form 45, CA-Dispatch DataBase Information Form, is used to establish reports in CA-DISPATCH and to request the assignment of access to the reports by RACF user groups. These forms are available from the XXXXXXX Systems Programming Branch.

2.3. Will users have access to all data on the system or will the user's access be restricted?  Explain.

Access to CADE is granted as described above using the OL5081 process. Access is restricted to administrative users on a need-to-know basis. XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/RACF is used on the mainframe, utilizing the RACF for DB2 security features for access controls and authorizations. Group authorizations are used wherever possible while capturing individual accountability, as required. Role-based access, profiles, and separation of duties ensure that users can only access data as required by their jobs. By the nature of the mainframe environment administrative users have access to taxpayer data.

2.4. What controls are in place to prevent the misuse (e.g. browsing) of data by those having access?

While CADE and its privileged users are subject to UNAX requirements like all employees and would be disciplined for violations, there is no systemic way of enforcing the UNAX policies. Some activities by the administrative users require that the users get a block of records instead of an individual taxpayer record. In this instance the DB2 logs would only reflect the first and last row of the record block accessed and would not show which of the records within the block the user actually viewed. Also, for this CADE Release this information is collected, but XXXXXXX does not currently have the capability to run reports against the DB2 logs.

Another problem with auditing administrative users is that currently XXXXXX does not have any negative TIN checking ability for administrative users to comply with the UNAX requirements. Even when the TIN being accessed by an administrative user is registered by the system, there is no method to know whether that user should be allowed to access that record. A means to block administrative users from accessing records of relatives, friends, neighbors, etc. does not exist.

Controls are in place, including a requirement that all CADE developers take Live Data training on the proper handling of live data. This complements the required UNAX and security awareness training (soon to be replaced by the Information Protection training) for all employees and contractors. Individuals gain access through the OL5081 process only once they are identified as having a need to know.

The existing Unauthorized Access and Inspection of Taxpayer Records (UNAX) policies, procedures, and practices are applicable to CADE. XXXXXXX uses the policies and procedures in the following documents to ensure that unauthorized individuals do not read, copy, alter, or steal printed or electronic information:
* XXXXXXX Standard Operating Procedure (SOP) No. 2.1-05, Request for Taxpayer Information
* XXXXXXX SOP No. 2.1-16, Production and Distribution of Non-Tape Output
* IRM 1.3, Chapter 1, Disclosure of Official Information Handbook.

Individuals with access to CADE are reminded of their responsibility and the potential for monitoring through a warning banner. RACF is used to authenticate and authorize users. Access controls will not be less than those provided by the RACF Law Enforcement Manual (LEM) 2.1.10.7.2.

The Security Auditor at XXXXXXX has the capability to audit the activities of the administrative users to monitor for some UNAX violations, the creation/modification/deletion/query of CADE data records, and changes that have been made to group and user privileges. This auditing process is done using RACF and Vanguard reports.

2.5.a   Do other systems share data or have access to data in this system?  If yes, explain.

CADE is part of an integrated, enterprise-wide Tax Administration system and is the single authoritative source of Tax Return and Tax Account data for the taxpayer records that are part of CADE. The data will be provided via messaging APIs to other applications. CADE will transmit data to the following systems:

* Refund Information File (RFIF)
* Questionable Refund Program/Refund Interest Program/Electronic Tax Administration (QRP/RIP/ETA)
* Duplicate Direct Deposit (DDD)
* Statistics of Income (SOI)
* 701 Exec, Microfilm Replacement System (MRS)
* IMF Weekly Reports
* Return to CPE and IMF
* Taxpayer Address Request (TAR) – Legacy Account Formatted File (LAFF) Summary
* Corporate Files Online (CFOL)
* Interim Revenue Accounting Control System (IRACS) Recap Data
* Interim Revenue Accounting Control System (IRACS) Refund Data
* Financial Management Information Systems (FMIS)
* Reciprocal Accounting Control Record (RACR)
* XXXXXXXXXXXXXXXXXXXXXXXXXXX/Processing Validation Section Recap Information (XXXXXXX_PVS Recap)
* XXXXXXXXXXXXXXXXXXXXXXXXXXX/Processing Validation Section Refund Information (XXXXXXX_PVS Refund)
* Individual Master Files (IMF)
* Electronic Certification System (ECS)
* Microfilm Replacement System (MRS)
* Individual Return Transaction File Online (IRTFOL)
* Return Transaction File (RTF)
* Refund Timeliness Program (RTP)
* Enterprise System Management (ESM)
* Taxpayer Account Transcripts
* Send to Current Processing Environment (CPE)
* LARS-format Balance and Control Data
* Balancing Reports
* Security Administration System (SAS) Reports
* Service Center Input Processing Automation System (SCIPAS) Reports
* Accountability Acceptance Vouchers (AAV) Reports
* Obligation Balance Validation Reports
* Weekly Obligation Balance Data
* CADE Initialization to IMF.
* IMF Annual Conversion
* CPE for Address Change – this information goes to the following CPE systems:
o Enhanced Entity Index File (EEIF)
o Key Index File (KIF)
o Name Search File (NSF)
o National Account Profile (NAP)
o Address Error Report
o CADE R2CPE Reports
* CPE for Discriminant Index Function (DIF) Processing
* Social Security Administration Self Employment Data
* US Census Bureau Economic Data

2.5.b. Who will be responsible for protecting the privacy rights of the taxpayers and employees affected by the interface?

The IRS is responsible for protecting the privacy rights of the taxpayers affected by the interface. Most interfaces are internal to the IRS and security and privacy controls are governed by the ICDs, PIA, and the System Security Plan (SSP) for CADE. Additionally, external interfaces are governed by Interconnection Security Agreements (ISAs). These documents provide specific details about how information is to be transferred and protected.

2.6.a. Will other agencies share data or have access to data in this system (International, Federal, State, Local, Other)?

The following Federal agencies receive data from CADE:

* Refund data to the Treasury Department and the Chief Information Officer (CIO),
* Refund data to FMS,
* Schedule SE and C data to Census, and
* Transcript data will be available to Government Accountability Office (GAO).

2.6.b. How will the data be used by the agency?

Data from CADE will be used by the receiving divisions of the agencies for the following:
FMS will use refund information transmitted by CADE to do refund processing and disbursement.
The Chief Financial Officer (FMS Reports) will use data from the CADE reports to take corrective actions on, among other things, misdirected Electronic Fund Transfer (EFT) refunds. These reports will be part of CADE’s process to meet the Joint Financial Management Improvement Program (JFMIP) requirements for detailed disbursement confirmation. Taxpayer Account Transcripts report taxpayer information to allow auditors to review the processing of CADE; this is intended for the use of GAO auditors. The file provided to Census is built by extracting the previously formatted records from LRFFOL for taxpayers who have filed tax returns with Schedules SE or C and writing them into the Census Data Records file for demographic and statistical purposes. The SSA uses the information to verify self-employment information.

2.6.c. Who is responsible for assuring proper use of the data?

The Designated Accrediting Authority (DAA) of the receiving agency will be responsible for the proper use of that data at the agency. The data exchanges between CADE and external interfaces are governed by Interface Control Documents (ICDs), Interconnection Security Agreements (ISAs), and/or Memorandums of Understanding (MOUs), depending on the method of transfer and whether the interface is internal or external to the IRS. The DAA is a senior agency official who manages the mission or function supported by an information system or network, or who has authority to evaluate the overall mission requirements of the information system or network.

2.6.d. How will the system ensure that agencies only get the information they are entitled to under IRC 6103?

The Interconnection Security Agreements (ISAs) and the ICDs between CADE and the systems receiving data from CADE define the data that the system will receive, the reason for the data exchange, retention period for this data, and how the data will be used.
The flow of the data files coming to CADE from its external data sources is controlled by explicit file names specific to CADE that designate the service center the data is coming from, the date of the file, and whether it is a data file or a control file.
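
As an illustration only, a file-name control of this kind could be checked along the following lines. This PIA does not give the actual naming layout (that is defined in the ICDs), so the pattern, the service-center code "AUS", and the function name below are all hypothetical.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional, Set

# Hypothetical layout: CADE.<service center>.<YYYYMMDD>.<DATA|CTRL>
# The real convention is set by the ICDs and is not reproduced here.
_NAME_PATTERN = re.compile(
    r"CADE\.(?P<center>[A-Z]{2,4})\.(?P<date>\d{8})\.(?P<kind>DATA|CTRL)"
)


class InboundFile(NamedTuple):
    center: str        # service center the data is coming from
    file_date: date    # date of the file
    is_control: bool   # True for a control file, False for a data file


def parse_inbound_name(name: str, known_centers: Set[str]) -> Optional[InboundFile]:
    """Return the parsed fields, or None if the name should be rejected."""
    match = _NAME_PATTERN.fullmatch(name)
    if match is None:
        return None  # name does not follow the assumed CADE-specific layout
    if match.group("center") not in known_centers:
        return None  # unknown originating service center
    try:
        file_date = datetime.strptime(match.group("date"), "%Y%m%d").date()
    except ValueError:
        return None  # eight digits that are not a real calendar date
    return InboundFile(match.group("center"), file_date, match.group("kind") == "CTRL")
```

A file whose name fails any of the three checks would simply not be accepted for processing under this sketch.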

3. Attributes of the Data

3.1. Is the use of the data both relevant and necessary to the purpose for which the system is being designed?

The data maintained in CADE is both relevant and necessary to support the Tax Administration and reporting requirements of the IRS. The CADE project will build the data stores illustrated as the Taxpayer Account Database in the Enterprise Architecture. CADE is built based on the framework established for tax processing under IMF. The contents of files as established by the Interface Control Documents, then, are based upon IMF and have not been reviewed to ensure that each and every data element for each file remains necessary for a modernized system.

3.2.a. Will the system derive new data or create previously unavailable data about an individual through aggregation from the information collected?

CADE will contain only taxpayer data that is currently part of the master files; no new data will be stored or derived about the individual taxpayer. CADE is the system and database that will replace the aging Masterfile System used for tax administration. All the data received and stored in CADE is deemed necessary for tax administration.

3.2.b. Will the new data be placed in the individual's record (taxpayer or employee)?

No new data will be collected or placed in the taxpayer’s record. The activity data on administrative users captured in audit logs will not be placed in the employee’s record.

3.2.c. Can the system make determinations about taxpayers or employees that would not be possible without the new data?

During security audits, the activity logs could allow XXXXXXX to make determinations as to the appropriateness of the administrative user activities, for example, whether the user’s activities are in accordance with the UNAX restrictions as defined in IRS Document 10281, Employees’ Guide, Safeguarding Taxpayer Records, and in IRS Document 10280, Managers’ Guide, Safeguarding Taxpayer Records. (Note: No UNAX auditing is currently done; this limitation and risk has been identified and remains in place.)

3.2.d. How will the data be verified for relevance and accuracy?

The activity data in the audit logs will be captured directly from the system. IRM 10.8.1 requirements verify its relevance. The accuracy can only be verified by the assurance that the system has been implemented correctly. The Certification and Accreditation of CADE will provide this assurance. CADE’s data transmissions are detailed in ICDs and are thoroughly vetted.

3.3.a. If the data is being consolidated, what controls are in place to protect the data and prevent unauthorized access? Explain.

No data is consolidated in Release 2.2.

3.3.b. If processes are being consolidated, are the proper controls remaining in place to protect the data and prevent unauthorized access? Explain.

No processes are being consolidated as part of CADE Release 2.2.

3.4. How will the data be retrieved? Can it be retrieved by personal identifier? If yes, explain.

CADE data is retrievable by the taxpayer identification number (social security number or employer identification number), document locator numbers and alphabetically by name.

3.5. What are the potential effects on the due process rights of taxpayers and employees of:

a. consolidation and linkage of files and systems;
b. derivation of data;
c. accelerated information processing and decision making;
d. use of new technologies;
 
The administrative users will be able to retrieve data by Taxpayer Identifier Number (TIN) or by name.

Consolidation and linkage of files and systems – CADE does not consolidate or link any files or systems that are not currently consolidated or linked in master files.

Derivation of data – CADE contains taxpayer data, including Social Security Number, name, address, and financial information. This data could be used to identify and locate the taxpayer and obtain income and tax information. Browsing or misuse of this information would be a significant privacy issue. The activity data on administrative users that is being collected in the audit logs could potentially reveal UNAX and security violations by administrative users, resulting in administrative action being taken.


Accelerated information processing and decision making – Data will be timely and accurate and thus will result in improved service to and about the taxpayer. Daily processing will prevent inaccurate information from being disseminated to taxpayers and IRS employees.

Use of new technologies – The use of RACF for XXX will increase the protection given to the database and the ability to audit user access to taxpayer data.

How are the effects to be mitigated?

There are not sufficient controls to enforce UNAX policies. However, the following controls help prevent unauthorized monitoring:

* Clear separation of duties between the RACF Administrators, DBAs, Operations Personnel, and the XXXXXXX during security audits. These positions require complete separation of duties; for example, a RACF Administrator can only perform security audits if that RACF Administrator works in a different computing center and has only auditing privileges at the computing center being audited.
* Other access to the RACF system-level audit information is restricted to those people with a strict need-to-know, such as the personnel performing audits and the Treasury Inspector General for Tax Administration (TIGTA).
* The audit information is protected using a RACF profile.
* UNAX and privacy training is required for all personnel, IRS and contractor.
* All administrators are subject to UNAX and current security policies regarding disclosure of information. (NOTE: While all administrative users of CADE are subject to UNAX policies, CADE has an identified risk regarding inadequate UNAX protections).
* Disclosure of returns and return information may be made only as provided by (1) 26 U.S.C. 3406, and (2) 26 U.S.C. 6103.
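
The separation-of-duties example in the first control above can be expressed as a simple check. This is a minimal sketch, not the RACF configuration itself: the `Person` record, the "AUDIT" and "ADMIN" privilege labels, and the center names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Person:
    home_center: str
    # Privileges held at each computing center, e.g. {"CENTER_B": {"AUDIT"}}.
    # The labels are illustrative, not real RACF attribute names.
    privileges: Dict[str, Set[str]] = field(default_factory=dict)


def may_audit(person: Person, audited_center: str) -> bool:
    """Apply the stated rule: the auditor must work at a different
    computing center and hold only auditing privileges at the
    center being audited."""
    if person.home_center == audited_center:
        return False  # must work in a different computing center
    held = person.privileges.get(audited_center, set())
    return held == {"AUDIT"}  # auditing privileges and nothing else
```

For instance, an administrator based at one center who also holds an administrative privilege at the audited center would fail the check, as would anyone auditing their own center.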

4. Maintenance of Administrative Controls

4.1.a. Explain how the system and its use will ensure equitable treatment of taxpayers and employees.

With the exception of the administrative users, CADE Release 2.2 data will be accessed by batch jobs only, preventing taxpayers from being treated in other than an equitable manner. Administrative users will have access to CADE data. This access is granted through their RACF privileges, which are determined by the XXXXXXX administrators based on the activities that CADE requires. To obtain these privileges, a user must have on file an IRS 5081 form and an IRS 104 form, both of which require a manager’s approval and signature. Defining these privileges is the responsibility of XXXXXXX and is not controlled by CADE. DB2 will collect administrative user activity data; however, there is no inequitable treatment, since this data will be collected on all administrative users. Currently there is no capability to report on this data; however, the IRS is trying to work out a resolution that will allow XXXXXXX to check for UNAX violations. This resolution would require XXXXXXX to acquire new software, either COTS or written in-house, with the capability to log all activities of the administrative users and to report on these activities. IRS Modernization Security, the PRIME Security and Privacy Office, and XXXXXX are all involved in the effort to ensure that this gap in privacy protection is resolved.

Administrative users for CADE with possible access to CADE data are:
* RACF Security Administrator – RACF Security Administrator will provide assistance in the installation, maintenance, optimization, integration, backup, and recovery of RACF. The RACF Security Administrator will also execute and support security policies and procedures.
* DB2 Database Administrator – This classification covers individuals key to the design and administration of data warehousing, platform application dependencies, performance, maintenance and versioning.
* Systems Programmer – The principal focus of this position is the installation and maintenance of application and system software, troubleshooting, capacity planning, and performance monitoring.
* System Operators – Operators are responsible for day-to-day operation of CADE and associated applications and utilities.
* System Developers – Developers build the system and may have access to live data during the testing of the system.
* Contract Employees – Contractors often serve as system programmers, operators and system developers.
* Other: The XXXXXXXXXXXXXXXXXXXXXXX (XXXXXXX) serves as the security auditor for XXXXXXXXXXXXXXXXXXXXXXXXXXXX (XXXXXXX). XXXXXXX monitors and reviews the activities of CADE DBAs, RACF Administrators, and Systems Programmers within the CADE data sharing group and reports any violations or questionable activities to XXXXXXX Security. There remain legacy risks around auditing for CADE.

XXXXXXXXXXXXXXXXXXXXXXXXXXX/RACF will be used on the mainframe, utilizing the RACF for DB2 security features for access controls and authorizations. Group authorizations will be used, wherever possible, while capturing individual accountability, as required. Role-based access, profiles, and separation of duties will ensure that users can only access data as required by their jobs.
Disclosure of returns and return information may be made only as provided by (1) 26 U.S.C. 3406, and (2) 26 U.S.C. 6103. All other existing policies, practices, and procedures (e.g., UNAX) continue to be enforced.

4.1.b. If the system is operated in more than one site, how will consistent use of the system be maintained at all sites.

Authoritative CADE data will be maintained at XXXXXXX and CADE will be operated only at XXXXXXX. Data backups will be warehoused at XXXXXXX and at the off-site storage. CADE will also be resident at the XXXXXXX; however, this version of CADE will only be used for disaster recovery.
Live data is used in the development area. The CADE SAT effort uses copies of taxpayer data to complete testing activities.
 
4.1.c. Explain any possibility of disparate treatment of individuals or groups.

CADE provides no processing that would result in disparate treatment of individuals or groups. All other existing policies, practices, and procedures that ensure equal and fair treatment of all taxpayers (e.g., UNAX) continue to be enforced as completely as the current capabilities of XXXXXXX allow. As part of the normal CADE processing of a Tax Return, a Discriminant Index Function (DIF) score is computed. This score, along with other factors, is used to determine whether a Tax Return should be further examined for possible fraud. The DIF scoring is an established process used by both CADE and IMF, but the score received from this calculation can cause Tax Returns to be processed differently.
Disclosure of returns and return information may be made only as provided by (1) 26 U.S.C. 3406, and (2) 26 U.S.C. 6103.

4.2.a. What are the retention periods of data in this system?

Records are maintained in accordance with Records Disposition Handbooks, IRM 1.15.2.1 through IRM 1.15.2.31. The following retention periods are followed for CADE:

* Data Control and Accounting Records – Destroy one year after the end of the processing year, including the following:
  * All records that form a part of the audit trail of data flow into, through, and out of XXXXXXX processing systems.
  * Ledgers and other documents pertaining to the reconciliation of the general ledger accounts in the service centers with the money balances on CADE and the master files maintained on magnetic tape at XXXXXXX.
  * Card files, tickler files and other types of files used to record action taken and control workflow.
* CADE Data Archive – This data will be retained until the next archive; archives are expected to take place annually.
  * CADE will archive unneeded data from the production environment. This data can be restored to the Production Support environment if needed to support GAO audits. The archived data consists of Balance and Control, Send to MQ Status, and Statistics data. Archived data will not include taxpayer data.
* CADE Data Backups (final updated tape(s) for each calendar year) – Scratch after six months. These are magnetic tape files containing current records for all taxpayers. They contain the balance, status, and transactions applicable to the individual accounts during a specific tax period, including returns filed, amendments to returns, assessments, and debit and credit transactions.
* CADE Data Backups (all other weekly and daily backups) – Scratch after successful completion of the third update cycle. These are magnetic tape files containing current records for all taxpayers. They contain the balance, status, and transactions applicable to the individual accounts during a specific tax period, including returns filed, amendments to returns, assessments, and debit and credit transactions.

Retention periods for these files are controlled by IRM 1.15.19-1, Records Control Schedule for XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX, Item 32, Work Files (XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX). Work Files include: Interim Processing Media; Control Media; Print/Edit Media; Program Media; Special Project Media; Test Media; Checkpoint Media; Other Agency Media; and Unclassified Media. The retention period for these files is defined as “RELEASE for re-use when no longer needed in accordance with IRM Section 2800.”

4.2.b. What are the procedures for eliminating the data at the end of the retention period?  Where are the procedures documented?

Procedures for eliminating data at the end of retention period - Records are maintained in accordance with Records Disposition Handbooks, IRM 1.15.2.12.1 through IRM 1.15.2.12.50. The following procedures are in effect at XXXXXXXXXXX to dispose of data:
* XXXXXXX SOP No. 2.2.8-26 (Rev. 17), Degaussing Media.
* IRM 2.2.8, Magnetic Media Management.
* XXXXXXX SOP No. 2.2.8-29 (Rev. 21), Magnetic Media Rehabilitation.
* XXXXXXX Security Handbook Issuance No. 306, Destruction of Data.
* IRM 2.1.10, Information Systems Security.

CADE’s media sanitization and disposal are considered a common control, and the procedures in place for CADE are those in place at XXXXXXX. IRS follows disk sanitization procedures for destruction of discarded media. IRM 2.7.4, Management of Magnetic Media (Purging of SBU Data and Destruction of Computer Media), provides the procedures used for sanitizing electronic media for reuse (e.g., overwriting) and for controlled storage, handling, or destruction of spoiled media or media that cannot be effectively sanitized for reuse (e.g., degaussing). It addresses the responsibilities of management and employees for the care, cleaning, rehabilitation, storage, shipment, receipt, inspection, repair, destruction, and security of all magnetic media. Coverage includes all round reels, cartridges, cassettes, removable disks, optical disks, hard drives, etc. IRM 2.7.4 also discusses duties, responsibilities, and procedures expected of all IRS sites that own, control, maintain, receive, ship, transmit, or inventory magnetic media. CADE’s offsite tapes are not encrypted in transit.

4.2.c. While the data is retained in the system, what are the requirements for determining if the data is still sufficiently accurate, relevant, timely, and complete to ensure fairness in making determinations?

CADE will be the custodian of the most accurate and timely taxpayer and tax return information available to assist the IRS in Tax Administration. To help ensure timeliness, accounts must be updated for each transaction, returns and payments must be processed daily, and refunds must be provided to FMS daily. Balancing and reconciliation processes will be used to ensure completeness.

4.3.a. Is the system using technologies in ways that the IRS has not previously employed (e.g., Caller-ID)?

CADE is not employing any new technologies in Release 2.2.

4.3.b. How does the use of this technology affect taxpayer/employee privacy?

CADE is not employing any new technologies in Release 2.2.

4.4.a. Will this system provide the capability to identify, locate, and monitor individuals? If yes, explain.

CADE maintains tax records for certain classes of taxpayers and therefore can be used to identify and locate individuals. However, it is neither designed nor used to identify, locate, and monitor individuals. Information is used only for the purposes identified in this PIA and in the Privacy Act notices. There are administrative, technical, and physical constraints to prevent this functionality.

4.4.b. Will this system provide the capability to identify, locate, and monitor groups of people? If yes, explain.

CADE maintains tax records for certain classes of taxpayers.

4.4.c. What controls will be used to prevent unauthorized monitoring?

As noted earlier, there are not sufficient controls to enforce UNAX policies. However, the following controls help prevent unauthorized monitoring:

* Clear separation of duties between the RACF Administrators, DBAs, Operations Personnel, and the Security Auditor. These positions require complete separation of duties; for example, a RACF Administrator can only be the Security Auditor if that RACF Administrator works in a different computing center and has only Auditor privileges at the computing center being audited.
* Other access to the RACF system-level audit information is restricted to those people with a strict need-to-know, such as the Security Auditor and the Treasury Inspector General for Tax Administration (TIGTA).
* The audit information is protected using a RACF profile.
* UNAX and privacy training is required for all personnel, IRS and contractor.
* All administrators are subject to UNAX and current security policies regarding disclosure of information. (NOTE: While all administrative users of CADE are subject to UNAX policies, CADE has an identified risk regarding inadequate UNAX protections).

* Disclosure of returns and return information may be made only as provided by (1) 26 U.S.C. 3406, and (2) 26 U.S.C. 6103.

4.5.a. Under which Systems of Record Notice (SORN) does the system operate? Provide number and name.

CADE is currently covered by Treasury/IRS 24.030, CADE Individual Master File (IMF), as well as by Treasury/IRS 24.046, Business Master File (BMF), and Treasury/IRS 34.047, IRS Audit Trail & Security Records System.

4.5.b. If the system is being modified, will the SORN require amendment or revision? Explain.

IRS Office of Disclosure has determined that CADE will not require a new System of Records Notice (SORN).

 


Page Last Reviewed or Updated: September 17, 2007