Processing XML with Java: A Guide to SAX, DOM, JDOM, JAXP, and TrAX

The Budget

As an example of this process, I'm going to use U.S. federal government budget authorization data, which the Office of Management and Budget (OMB) publishes in a variety of equivalent flat formats, even though the data itself is relatively unflat. This is a good example of the sort of legacy data developers often have to deal with. The complete document [http://w3.access.gpo.gov/usbudget/fFY2002/db.html] consists of 3,185 line items. Each line item consists of 43 separate fields. In the comma-separated values (CSV) version of the file, a typical line item looks like this:

"418","National Endowment for the Humanities","00",
"National Endowment for the Humanities","0200",
"National Endowment for the Humanities: grants and administration",
"59","503","Research and general education  aids","Discretionary",
"On-budget",   0,  0,   0, 121275,
145231,  150100,  151299,  130560, 135447, 140118,  139478, 132582,
138890,  140435,  153000,  156910, 170002, 175955,  177413, 177491,
172000,  110000,  110000,  111000, 112000, 115000,  120000, 121000,
124000, 126000, 129000,  132000

Each field is separated from the following field by a comma. Strings are enclosed in double quotes, and may contain commas that do not delimit fields. Dollar amounts are given as integers in units of one thousand dollars. That is, the last value in the above line is 132 million dollars, not 132 thousand dollars. Table 4.1 identifies the 43 separate fields.
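The quoting rule described above can be sketched as a small state machine. This is only an illustrative sketch, not the code this book develops: the class name `CsvLineSplitter` and its `split` method are hypothetical, and it assumes that quoted strings never contain escaped double quotes and that leading and trailing whitespace in a field can be discarded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (an assumption for illustration, not from the book).
// Splits one CSV record into fields, treating commas inside double quotes
// as ordinary text rather than as field delimiters.
final class CsvLineSplitter {

    static List<String> split(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (c == '"') {
                // Entering or leaving a quoted string; the quote itself is dropped.
                // Assumption: the data contains no escaped quotes.
                inQuotes = !inQuotes;
            } else if (c == ',' && !inQuotes) {
                // Only an unquoted comma ends the current field.
                fields.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // The last field has no trailing comma.
        fields.add(current.toString().trim());
        return fields;
    }

    public static void main(String[] args) {
        // A shortened record in the style of the sample line item.
        String record = "\"418\",\"National Endowment for the Humanities\","
                + "\"On-budget\",   0, 132000";
        List<String> fields = split(record);

        // Amounts are stored in thousands of dollars, so scale by 1,000.
        long dollars = Long.parseLong(fields.get(fields.size() - 1)) * 1000L;
        System.out.println(fields.size() + " fields; last amount = " + dollars + " dollars");
        // prints "5 fields; last amount = 132000000 dollars"
    }
}
```

Trimming each field is what lets the padded numeric values such as `   0` parse cleanly as integers. A production parser would also need to decide how to handle malformed records, such as an unterminated quote.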

Table 4.1. Public Budget Database Field Descriptions
| Field Number | Field Name | Description |
|---|---|---|
| 1 | Agency code | A unique, three-digit, numerical code for the cabinet department or independent agency. |
| 2 | Agency name | The name of the cabinet-level department (e.g., Department of Defense) or independent agency (e.g., Peace Corps). Even though only the executive branch has true agencies, for purposes of the budget, offices within the legislative branch are given the agency name "Legislative Branch" and offices within the judicial branch are given the agency name "Judicial Branch." Agency names have a maximum of 89 characters. |
| 3 | Bureau code | A two-digit numerical code for the bureau within the cabinet department or independent agency; bureau codes are unique only within an agency. |
| 4 | Bureau name | The name of the suboffice within the agency (e.g., Coast Guard or Federal Aviation Administration). Budget amounts for the agency as a whole that are not part of a specific bureau are generally categorized under a fictional bureau with the same name as the agency. Bureau names have a maximum of 89 characters. |
| 5 | Account code | A four-digit code (outlays) or six-digit code (offsetting receipts) for the account within the bureau. |
| 6 | Account name | The name of the budgeted function within the bureau. Account names have a maximum of 160 characters. |
| 7 | Treasury Agency code | A two-digit numerical code for the agency, assigned by the Treasury Department. |
| 8 | Subfunction code | A three-digit numerical code for the subfunction within an account. |
| 9 | Subfunction title | The name of the subfunction within the account. Subfunction titles have a maximum of 72 characters. |
| 10 | BEA category | Budget Enforcement Act category: "Mandatory," "Discretionary," or "Net interest." |
| 11 | On- and off-budget indicator | "On-budget" or "Off-budget"; Social Security trust funds and the Postal Service are off-budget; all other accounts are on-budget. |
| 12 | 1976 value | Actual amounts, in thousands of dollars, for fiscal year (FY) 1976. Budget authority is usually shown as a positive value. Offsetting receipts are usually negative values. |
| 13 | TQ value | Actual amount, in thousands of dollars, for the "transitional quarter" in 1976 when the government shifted the start of its fiscal year ahead from July to October. |
| 14–37 | 1977–2000 | Actual amounts, in thousands of dollars, for each fiscal year from 1977 to 2000. |
| 38–end | 2001–2006 | Estimated amounts, in thousands of dollars, for FY 2001 through FY 2006. |
Source: Adapted from U.S. Office of Management and Budget. Budget Analysis Branch. Public Budget Database User's Guide: Budget of the United States Government, Fiscal Year 2002 [http://a257.gakamaitech.net/7/257/2422/09apr20010800/www.gpo.gov/usbudget/FY2002/pdf/db_guide.pdf], April 9, 2001, 8.
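
The positional layout in Table 4.1 lends itself to a simple lookup from field number to fiscal period. The following is a rough sketch under stated assumptions rather than anything prescribed by the OMB guide or this book: the class `BudgetFields` and its methods are hypothetical names, field numbers are 1-based as in the table, and "38–end" is taken to mean fields 38 through 43 because each line item has 43 fields.

```java
// Hypothetical helper (an assumption for illustration, not from the book
// or the OMB guide). Field numbers are 1-based, matching Table 4.1.
final class BudgetFields {

    static final int FIELD_COUNT = 43;        // fields per line item
    static final int FIRST_AMOUNT_FIELD = 12; // FY 1976
    static final int FIRST_ESTIMATE_FIELD = 38; // FY 2001

    /** Returns "1976", "TQ", or a fiscal year from "1977" to "2006". */
    static String amountLabel(int fieldNumber) {
        if (fieldNumber < FIRST_AMOUNT_FIELD || fieldNumber > FIELD_COUNT) {
            throw new IllegalArgumentException(
                    "Not an amount field: " + fieldNumber);
        }
        if (fieldNumber == 12) return "1976";
        if (fieldNumber == 13) return "TQ"; // transitional quarter
        // Fields 14..43 hold one fiscal year each, starting at 1977.
        return String.valueOf(1977 + (fieldNumber - 14));
    }

    /** True for the estimated years (FY 2001 through FY 2006). */
    static boolean isEstimate(int fieldNumber) {
        return fieldNumber >= FIRST_ESTIMATE_FIELD && fieldNumber <= FIELD_COUNT;
    }

    /** Converts a stored amount (thousands of dollars) to dollars. */
    static long toDollars(long thousands) {
        return Math.multiplyExact(thousands, 1000L);
    }

    public static void main(String[] args) {
        // The last field of the sample line item is 132000.
        System.out.println(amountLabel(43) + ": " + toDollars(132000L)
                + " dollars, estimate = " + isEstimate(43));
    }
}
```

Keeping the transitional quarter as its own label, rather than folding it into 1976 or 1977, preserves the distinction the table draws between fields 12, 13, and 14.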
