February 15, 2019
CREATED_AT | TREE_ID | BLOCK_ID | THE_GEOM | TREE_DBH | STUMP_DIAM | CURB_LOC | STATUS | HEALTH | SPC_LATIN | SPC_COMMON | STEWARD | GUARDS | SIDEWALK | USER_TYPE | PROBLEMS | ROOT_STONE | ROOT_GRATE | ROOT_OTHER | TRNK_WIRE | TRNK_LIGHT | TRNK_OTHER | BRNCH_LIGH | BRNCH_SHOE | BRNCH_OTHE | ADDRESS | ZIPCODE | ZIP_CITY | CB_NUM | BOROCODE | BORONAME | CNCLDIST | ST_ASSEM | ST_SENATE | NTA | NTA_NAME | BORO_CT | STATE | LATITUDE | LONGITUDE | X_SP | Y_SP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
'08/27/2015' | 180683 | 348711 | 'POINT (-73.84421521958048 40.723091773924274)' | 3 | 0 | 'OnCurb' | 'Alive' | 'Fair' | 'Acer rubrum' | 'red maple' | 'None' | 'None' | 'NoDamage' | 'TreesCount Staff' | 'None' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | '108-005 70 AVENUE' | '11375' | 'Forest Hills' | 406 | 4 | 'Queens' | 29 | 28 | 16 | 'QN17' | 'Forest Hills' | 4073900 | 'New York' | 40.72309177 | -73.84421522 | 1027431.14821 | 202756.768749 |
'09/03/2015' | 200540 | 315986 | 'POINT (-73.81867945834878 40.79411066708779)' | 21 | 0 | 'OnCurb' | 'Alive' | 'Fair' | 'Quercus palustris' | 'pin oak' | 'None' | 'None' | 'Damage' | 'TreesCount Staff' | 'Stones' | 'Yes' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | '147-074 7 AVENUE' | '11357' | 'Whitestone' | 407 | 4 | 'Queens' | 19 | 27 | 11 | 'QN49' | 'Whitestone' | 4097300 | 'New York' | 40.79411067 | -73.81867946 | 1034455.70109 | 228644.837379 |
'09/05/2015' | 204026 | 218365 | 'POINT (-73.93660770459083 40.717580740099116)' | 3 | 0 | 'OnCurb' | 'Alive' | 'Good' | 'Gleditsia triacanthos var. inermis' | 'honeylocust' | '1or2' | 'None' | 'Damage' | 'Volunteer' | 'None' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | '390 MORGAN AVENUE' | '11211' | 'Brooklyn' | 301 | 3 | 'Brooklyn' | 34 | 50 | 18 | 'BK90' | 'East Williamsburg' | 3044900 | 'New York' | 40.71758074 | -73.9366077 | 1001822.83131 | 200716.891267 |
'09/05/2015' | 204337 | 217969 | 'POINT (-73.93445615919741 40.713537494833226)' | 10 | 0 | 'OnCurb' | 'Alive' | 'Good' | 'Gleditsia triacanthos var. inermis' | 'honeylocust' | 'None' | 'None' | 'Damage' | 'Volunteer' | 'Stones' | 'Yes' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | '1027 GRAND STREET' | '11211' | 'Brooklyn' | 301 | 3 | 'Brooklyn' | 34 | 53 | 18 | 'BK90' | 'East Williamsburg' | 3044900 | 'New York' | 40.71353749 | -73.93445616 | 1002420.35833 | 199244.253136 |
'08/30/2015' | 189565 | 223043 | 'POINT (-73.97597938483258 40.66677775537875)' | 21 | 0 | 'OnCurb' | 'Alive' | 'Good' | 'Tilia americana' | 'American linden' | 'None' | 'None' | 'Damage' | 'Volunteer' | 'Stones' | 'Yes' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | 'No' | '603 6 STREET' | '11215' | 'Brooklyn' | 306 | 3 | 'Brooklyn' | 39 | 44 | 21 | 'BK37' | 'Park Slope-Gowanus' | 3016500 | 'New York' | 40.66677776 | -73.97597938 | 990913.775046 | 182202.425999 |
↓
0101010010111010101010001010101001101001001010010001010101001...
How is data stored?
Problem 1: How should you encode one tuple?
,
) between fieldsProblem 2: How should you encode a file of tuples?
\r\n
) at record endProblem 2.b: How should you store records in a page?
Goal 1: Where is record $X$?
Goal 2: Support updates/deletions
What about variable-size records?
Why store the key and records from opposite ends?
Problem 3: How should you organize pages in a file?
Key question: What happens when all records on a page are deleted?
Idea: Track empty pages.
Each file stores 2-tuples $\left< RowID, Value\right >$.
Values only for one attribute.
Great for wide, rarely-updated tables where only a few attributes are used per-query
Abstract the messy details of File-IO
Is the page in the buffer pool?
Does this all sound familiar?
Isn't this just Virtual Memory?
(Many databases use memory-mapped files as a buffer manager)
Databases can predict the future!
SELECT * FROM R WHERE A > 500 AND A < 2000
→ Pages 10-12
How do we decide which pages hold the query results?
Answers next class!