There is a temporary SQLite database that is accessible from the bridge nodes while we are bringing up MySQL, so that users can have a little better access to the Landsat data holdings.
SQLite provides basic SQL interface and is accessible from number of languages and more extensive documentation can be found here: http://www.sqlite.org/.
There are currently two main tables in the database, which contain links to the Landsat files on line and some additional metadata information.
The "landsat" table is created by inserting existing files into the database pariodically.
TABLE landsat( granule_id text, #This is the primary key a unique Landsat scene ID noloc_id text, #granule_id with the gsid stripped sensor text, #Which sensor (T or E) = (TM or ETM) satellite int, #4, 5, or 7 path int, row int, url text, #The path to the file on the filesystem acq_date date, #Acquisition date (by the satellite) = overpass date year int, #Year of data acquisition (overpass) doy int, #Day-of-year of acquisition (overpass) 1-366 gsid text, #Ground station id version int, #Processing version is_local int, #1 = online, 0 = in the backup system UNIQUE(granule_id) )
The "meta" table is created from parsed from bulk metadata obtained periodically from the USGS. More information on the metadata is in the Landsat Data Dictionary, but since this describes the overall internal metadata structure of the datasets, there is quite a bit more information than we have in the DB right now. The schema of the external metadata that our "meta" table is extracted from can be found here.
TABLE meta( scene_id VARCHAR(200), #Matches the granule_id in the landsat table sensor text, satellite int, path int, row int, ul_lat float, ul_lon float, ur_lat float, ur_lon float, ll_lat float, ll_lon float, lr_lat float, lr_lon float, center_lat float, center_lon float, acq_date date, processed_date date, #When this scene was last processed at USGS year int, #Year of data acquisition (overpass) doy int, #Day-of-year of acquisition (overpass) 1-366 cloud_cover int, #Cloud cover interval 0-9 (0 = 0 to 10%, 1 = 10 to 20%,...) cloud_cover_full float, #Cloud cover % for this scene (0.0 - 100.0) - not always available l1type VARCHAR(10), #L1G, L1GT, L1Gt, L1T, PR image_quality1 int, #For L7 - image quality for bands 1-6l (0=bad, 9=good) image_quality2 int, #For L7 - image quality for bands 6h, 7, 8 (0=bad, 9=good) acquisition_quality int,# 0=missing, 1=extremely poor, 9=excellent quality_band1 int, # 0=worst, 9=excellent quality_band2 int, # 0=worst, 9=excellent quality_band3 int, # 0=worst, 9=excellent quality_band4 int, # 0=worst, 9=excellent quality_band5 int, # 0=worst, 9=excellent quality_band6 int, # 0=worst, 9=excellent quality_band7 int, # 0=worst, 9=excellent UNIQUE(scene_id) )
Accessing the SQLite database
There are two basic ways you can interact with the database. Directly from the command line and from your code through the SQLite API. To do either you need to know the location of the database file which is in:
Access from the command-line
So from command line you can just do (assuming you are on bridge4):
bridge4> sqlite3 /nobackupp4/datapool/landsat/rt/DB/SQLite/landsat-sqlite.db
You will get the sqlite prompt and you can just type your queries in there. For more information, see the SQLite documentation.
Access from Python
To access the database from python, you will need to import the sqlite3 module, open the connection to the SQLite db and establish the cursor (a structure that will deal with the results as they are returned from the database).
import sqlite3 as sqlite; sqlf = '/nobackupp4/datapool/landsat/rt/DB/SQLite/landsat-sqlite.db'; #location of the db file conn = sqlite.connect(sqlf) curs = conn.cursor(); ... some queries and result manipulation curs.close(); conn.close();
And you are ready to run some queries.
Example queries from Python
Find all the Landsat7 data for 2009, path 34 and row 40:
import sqlite3 as sqlite; sqlf = '/nobackupp4/datapool/landsat/rt/DB/SQLite/landsat-sqlite.db'; conn = sqlite.connect(sqlf) curs = conn.cursor(); results = None; try: curs.execute("SELECT url FROM landsat WHERE path=? AND row=? AND year=? AND satellite=7", (34, 40, 2009)); results = curs.fetchall(); for r in results: print r; except sqlite.Error as e: print "An error occurred:", e.args; curs.close(); conn.close();
Query across the two tables to find data with less than 30% cloud cover
... curs.execute("SELECT url FROM landsat INNER JOIN meta ON granule_id = scene_id WHERE \ meta.cloud_cover < 3 AND landsat.path=? AND landsat.row=? AND landsat.year=? AND \ landsat.satellite=7", (34, 40, 2009)); results = curs.fetchall(); ...