MySQL Cookbook

7.16 Date-Based Summaries

7.16.1 Problem

You want to produce a summary based on date or time values.

7.16.2 Solution

Use GROUP BY to categorize temporal values into bins of the appropriate duration. Often this will involve using expressions to extract the significant parts of dates or times.

7.16.3 Discussion

To put records in time order, you use an ORDER BY clause to sort a column that has a temporal type. If instead you want to summarize records based on groupings into time intervals, you need to determine how to categorize each record into the proper interval and use GROUP BY to group them accordingly.

Sometimes you can use temporal values directly if they group naturally into the desired categories. This is quite likely if a table represents date or time parts using separate columns. For example, the baseball1.com master ballplayer table represents birth dates using separate year, month, and day columns. To see how many ballplayers were born on each day of the year, perform a calendar date summary that uses the month and day values but ignores the year:

mysql> SELECT birthmonth, birthday, COUNT(*)
    -> FROM master
    -> WHERE birthmonth IS NOT NULL AND birthday IS NOT NULL
    -> GROUP BY birthmonth, birthday;
+------------+----------+----------+
| birthmonth | birthday | COUNT(*) |
+------------+----------+----------+
|          1 |        1 |       47 |
|          1 |        2 |       40 |
|          1 |        3 |       50 |
|          1 |        4 |       38 |
...
|         12 |       28 |       33 |
|         12 |       29 |       32 |
|         12 |       30 |       32 |
|         12 |       31 |       27 |
+------------+----------+----------+

A less fine-grained summary can be obtained by using only the month values:

mysql> SELECT birthmonth, COUNT(*)
    -> FROM master
    -> WHERE birthmonth IS NOT NULL
    -> GROUP BY birthmonth;
+------------+----------+
| birthmonth | COUNT(*) |
+------------+----------+
|          1 |     1311 |
|          2 |     1144 |
|          3 |     1243 |
|          4 |     1179 |
|          5 |     1118 |
|          6 |     1105 |
|          7 |     1244 |
|          8 |     1438 |
|          9 |     1314 |
|         10 |     1438 |
|         11 |     1314 |
|         12 |     1269 |
+------------+----------+

Sometimes temporal values can be used directly, even when not represented as separate columns. To determine how many drivers were on the road and how many miles were driven each day, group the records in the driver_log table by date:

mysql> SELECT trav_date,
    -> COUNT(*) AS 'number of drivers', SUM(miles) AS 'miles logged'
    -> FROM driver_log GROUP BY trav_date;
+------------+-------------------+--------------+
| trav_date  | number of drivers | miles logged |
+------------+-------------------+--------------+
| 2001-11-26 |                 1 |          115 |
| 2001-11-27 |                 1 |           96 |
| 2001-11-29 |                 3 |          822 |
| 2001-11-30 |                 2 |          355 |
| 2001-12-01 |                 1 |          197 |
| 2001-12-02 |                 2 |          581 |
+------------+-------------------+--------------+

However, this summary will grow lengthier as you add more records to the table. At some point, the number of distinct dates likely will become so large that the summary fails to be useful, and you'd probably decide to change the category size from daily to weekly or monthly.
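As a rough sketch of what such a change in category size might look like, the following statements regroup the same driver_log data by week and by month. The aliases (yr_week, yr, mon, drivers, miles_logged) are made up for illustration, and YEARWEEK( ) is used here with its default mode, in which weeks begin on Sunday; pass a different mode argument if your weeks start on Monday.

```sql
-- Weekly summary: YEARWEEK() combines the year and week number into a
-- single value, so weeks from different years land in different groups.
SELECT YEARWEEK(trav_date) AS yr_week,
       COUNT(*) AS drivers,
       SUM(miles) AS miles_logged
FROM driver_log
GROUP BY YEARWEEK(trav_date)
ORDER BY yr_week;

-- Monthly summary: group on year and month together so that, for example,
-- December 2001 and December 2002 are not merged into one row.
SELECT YEAR(trav_date) AS yr,
       MONTH(trav_date) AS mon,
       COUNT(*) AS drivers,
       SUM(miles) AS miles_logged
FROM driver_log
GROUP BY YEAR(trav_date), MONTH(trav_date)
ORDER BY yr, mon;
```

With the six dates shown above, the monthly form would return two rows, one for November 2001 and one for December 2001. Note that COUNT(*) counts log records, so once a category spans several days, a driver who traveled on more than one day is counted more than once. If you need the number of distinct drivers, count the distinct values of whichever column identifies the driver instead.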

When a temporal column contains so many distinct values that it fails to categorize well, it's typical for a summary to group records using expressions that map the relevant parts of the date or time values onto a smaller set of categories. For example, to produce a time-of-day summary for records in the mail table, do this:[1]

[1] Note that the result includes an entry only for hours of the day actually represented in the data. To generate a summary with an entry for every hour, use a join to fill in the "missing" values. See Recipe 12.10.

mysql> SELECT HOUR(t) AS hour,
    -> COUNT(*) AS 'number of messages',
    -> SUM(size) AS 'number of bytes sent'
    -> FROM mail
    -> GROUP BY hour;
+------+--------------------+----------------------+
| hour | number of messages | number of bytes sent |
+------+--------------------+----------------------+
|    7 |                  1 |                 3824 |
|    8 |                  1 |                  978 |
|    9 |                  2 |                 2904 |
|   10 |                  2 |              1056806 |
|   11 |                  1 |                 5781 |
|   12 |                  2 |               195798 |
|   13 |                  1 |                  271 |
|   14 |                  1 |                98151 |
|   15 |                  1 |                 1048 |
|   17 |                  2 |              2398338 |
|   22 |                  1 |                23992 |
|   23 |                  1 |                10294 |
+------+--------------------+----------------------+
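One common way to obtain an entry for every hour, including those with no mail, is to join against a table that lists all 24 hours. The sketch below assumes a helper table named hours with a single column h; both names are hypothetical, and this is not necessarily the approach that Recipe 12.10 takes.

```sql
-- Hypothetical helper table holding one row per hour of the day (0..23).
CREATE TABLE hours (h TINYINT UNSIGNED NOT NULL PRIMARY KEY);
INSERT INTO hours (h) VALUES
  (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
  (12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23);

-- LEFT JOIN keeps every hour, even when no mail row matches it.
-- COUNT(mail.t) counts only matched rows, so empty hours show 0,
-- and COALESCE() turns the NULL sum for empty hours into 0.
SELECT hours.h AS hour,
       COUNT(mail.t) AS messages,
       COALESCE(SUM(mail.size), 0) AS bytes_sent
FROM hours LEFT JOIN mail ON HOUR(mail.t) = hours.h
GROUP BY hours.h
ORDER BY hours.h;
```

The count uses COUNT(mail.t) rather than COUNT(*) because COUNT(*) would report 1 for an hour that has no messages: the LEFT JOIN still produces one row for it, with NULL in the mail columns.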

To produce a day-of-week summary instead, use the DAYOFWEEK( ) function:

mysql> SELECT DAYOFWEEK(t) AS weekday,
    -> COUNT(*) AS 'number of messages',
    -> SUM(size) AS 'number of bytes sent'
    -> FROM mail
    -> GROUP BY weekday;
+---------+--------------------+----------------------+
| weekday | number of messages | number of bytes sent |
+---------+--------------------+----------------------+
|       1 |                  1 |                  271 |
|       2 |                  4 |              2500705 |
|       3 |                  4 |              1007190 |
|       4 |                  2 |                10907 |
|       5 |                  1 |                  873 |
|       6 |                  1 |                58274 |
|       7 |                  3 |               219965 |
+---------+--------------------+----------------------+

To make the output more meaningful, you might want to use DAYNAME( ) to display weekday names instead. However, because day names sort lexically (for example, "Tuesday" sorts after "Friday"), use DAYNAME( ) only for display purposes. Continue to group on the numeric day values so that output rows sort that way:

mysql> SELECT DAYNAME(t) AS weekday,
    -> COUNT(*) AS 'number of messages',
    -> SUM(size) AS 'number of bytes sent'
    -> FROM mail
    -> GROUP BY DAYOFWEEK(t);
+-----------+--------------------+----------------------+
| weekday   | number of messages | number of bytes sent |
+-----------+--------------------+----------------------+
| Sunday    |                  1 |                  271 |
| Monday    |                  4 |              2500705 |
| Tuesday   |                  4 |              1007190 |
| Wednesday |                  2 |                10907 |
| Thursday  |                  1 |                  873 |
| Friday    |                  1 |                58274 |
| Saturday  |                  3 |               219965 |
+-----------+--------------------+----------------------+
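The preceding query depends on two behaviors of older servers. In MySQL 5.7 and later, the ONLY_FULL_GROUP_BY SQL mode is enabled by default and rejects a query that selects DAYNAME(t) while grouping only on DAYOFWEEK(t), and MySQL 8.0 no longer sorts GROUP BY results implicitly. A variant that should behave the same way on newer servers, with shortened aliases chosen here for illustration, might look like this:

```sql
-- Group on both the numeric and the named form of the weekday so the
-- selected expression appears in GROUP BY, and sort explicitly on the
-- numeric value rather than relying on GROUP BY to order the rows.
SELECT DAYNAME(t) AS weekday,
       COUNT(*) AS messages,
       SUM(size) AS bytes_sent
FROM mail
GROUP BY DAYOFWEEK(t), DAYNAME(t)
ORDER BY DAYOFWEEK(t);
```

Adding DAYNAME(t) to the GROUP BY clause does not change the groups, because each numeric weekday value corresponds to exactly one name.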

A similar technique can be used for summarizing month-of-year categories that are sorted by numeric value but displayed by month name.
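A minimal sketch of that month-of-year variant, using the same mail table and MySQL's MONTH( ) and MONTHNAME( ) functions, could look like the following. The aliases are illustrative only. Listing both expressions in GROUP BY and sorting explicitly are precautions for newer servers, which are stricter about selected columns and do not guarantee row order from GROUP BY alone.

```sql
-- Display the month name, but group and sort on the month number so
-- that January comes first instead of April.
SELECT MONTHNAME(t) AS month_name,
       COUNT(*) AS messages,
       SUM(size) AS bytes_sent
FROM mail
GROUP BY MONTH(t), MONTHNAME(t)
ORDER BY MONTH(t);
```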

Uses for temporal categorizations are plentiful:

  • DATETIME or TIMESTAMP columns have the potential to contain many unique values. To produce daily summaries, strip off the time-of-day part to collapse all values occurring within a given day to the same value. Any of the following GROUP BY clauses will do this, though the last one is likely to be slowest:

    GROUP BY FROM_DAYS(TO_DAYS(col_name))
    GROUP BY YEAR(col_name), MONTH(col_name), DAYOFMONTH(col_name)
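    GROUP BY DATE_FORMAT(col_name,'%Y-%m-%e')

  • To produce monthly or quarterly sales reports, group by MONTH(col_name) or QUARTER(col_name) to place dates into the correct part of the year. If the data spans more than one year, group by YEAR(col_name) as well so that the same month or quarter from different years is not combined.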

  • To summarize web server activity, put your server's logs into MySQL and run queries that collapse the records into different time categories. Chapter 18 discusses how to do this for Apache.
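To make the first two items concrete, here is a sketch that applies them to the mail table used earlier in this section. It uses the DATE( ) function, which newer versions of MySQL provide as a more direct way to strip the time of day; on servers that lack it, substitute one of the GROUP BY expressions listed above. The aliases are hypothetical.

```sql
-- Daily summary: DATE() discards the time-of-day part, so all messages
-- sent on the same calendar day fall into one group.
SELECT DATE(t) AS msg_date,
       COUNT(*) AS messages,
       SUM(size) AS bytes_sent
FROM mail
GROUP BY DATE(t)
ORDER BY msg_date;

-- Quarterly summary: include the year so that the same quarter from
-- different years is reported separately.
SELECT YEAR(t) AS yr,
       QUARTER(t) AS qtr,
       COUNT(*) AS messages,
       SUM(size) AS bytes_sent
FROM mail
GROUP BY YEAR(t), QUARTER(t)
ORDER BY yr, qtr;
```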
