wiki:Internal/StudentPages/XiruoLiu



Transition probability matrix generation:

  1. convert timestamps to serial numbers and then sort
  2. divide the area into grids
  3. loop: for each taxiid, find the current grid and the next grid, and count the transition in the grid matrix (row = current grid #, column = next grid #)
  4. obtain the probability matrix by normalizing each row of the grid matrix (see the sketch after this list)
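
A minimal Python sketch of steps 1-4, assuming a 10*10 grid over the longitude/latitude range described further below; the function and variable names are illustrative, not the original code:

import numpy as np

N_SIDE = 10                      # 10 x 10 grids (assumed)
N_GRIDS = N_SIDE * N_SIDE
LON_MIN, LON_MAX = 121.2, 121.8  # covered area (see the grid description below)
LAT_MIN, LAT_MAX = 31.0, 31.5

def grid_index(lon, lat):
    # map a (lon, lat) point to a grid number in [0, N_GRIDS)
    col = min(int((lon - LON_MIN) / (LON_MAX - LON_MIN) * N_SIDE), N_SIDE - 1)
    row = min(int((lat - LAT_MIN) / (LAT_MAX - LAT_MIN) * N_SIDE), N_SIDE - 1)
    return row * N_SIDE + col

def transition_matrix(records):
    # records: (taxiid, timestamp, lon, lat) tuples, one per GPS report
    records = sorted(records, key=lambda r: (r[0], r[1]))    # steps 1-2
    counts = np.zeros((N_GRIDS, N_GRIDS))
    for prev, cur in zip(records, records[1:]):              # step 3
        if prev[0] != cur[0]:
            continue                                         # next taxiid, no transition
        counts[grid_index(prev[2], prev[3]), grid_index(cur[2], cur[3])] += 1
    row_sums = counts.sum(axis=1, keepdims=True)             # step 4: normalize rows
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)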

Update generation:

  1. set parameters such as the time period (1 day / 1440 mins), GUIDs (from 1 to 4000), grid number, and longitude and latitude ranges
  2. initialize the taxi matrix ( GUID | current grid | timestamp ): assign the current grid according to the location matrix, which holds density information about taxi locations, and set the timestamp to 0
  3. every minute (timestamp + 1), check whether each GUID generates an update
  4. compute the destination grid number from the transition probability matrix (converted to a CDF matrix); if it differs from the source grid number, generate an update
  5. write | event type | GUID | source | destination | timestamp | to the output file (see the sketch after this list)
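
A hedged Python sketch of the update-generation loop; the parameter values follow the steps above, while the function name, the output format details, and the seed handling are assumptions:

import numpy as np

MINUTES = 1440      # one simulated day
N_GUIDS = 4000
N_GRIDS = 100

def generate_updates(prob_matrix, location_density, out_path="updates.txt", seed=0):
    # location_density: length-100 vector of grid probabilities summing to 1
    rng = np.random.default_rng(seed)
    # step 2: place each GUID in an initial grid drawn from the location density
    current = rng.choice(N_GRIDS, size=N_GUIDS, p=location_density)
    cdf = np.cumsum(prob_matrix, axis=1)          # step 4: per-row CDF for sampling
    with open(out_path, "w") as out:
        for t in range(1, MINUTES + 1):           # step 3: advance one minute at a time
            for guid in range(1, N_GUIDS + 1):
                src = current[guid - 1]
                if cdf[src, -1] == 0:             # no transitions observed from this grid
                    continue
                dst = min(int(np.searchsorted(cdf[src], rng.random())), N_GRIDS - 1)
                if dst != src:                    # grid change => generate an update
                    out.write(f"UPDATE | {guid} | {src} | {dst} | {t}\n")
                    current[guid - 1] = dst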

Table TAXIDATA has all data loaded; table TAXI1 contains only the first data file.

The table definition is as follows:
CREATE TABLE TAXIDATA (
  ID           NUMBER(10) CONSTRAINT TAXIDATA_ID NOT NULL,
  TAXIID       NUMBER(7),
  LONGITUDE    NUMBER(9,6),
  LATITUDE     NUMBER(8,6),
  SPEED        NUMBER(3),
  ANGLE        NUMBER(3),
  DATETIME     TIMESTAMP(6),
  STATUS       NUMBER(1),
  EXTENDSTATUS NUMBER(1),
  REVISED      NUMBER(1),
  PRIMARY KEY (ID)
)
TABLESPACE USERS;


Locations in the most dense area, divided into 100 grids:
The picture shows 10k entries chosen from the first data file. The covered area spans longitude 121.2 to 121.8 and latitude 31 to 31.5, divided into 10*10 = 100 grids. A sketch of the grid binning follows.
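
A short sketch of how the 10*10 location (density) matrix can be computed from the longitude/latitude points, assuming numpy; the function name and argument defaults are taken from the description above but are otherwise illustrative:

import numpy as np

def location_matrix(lons, lats, n_side=10,
                    lon_range=(121.2, 121.8), lat_range=(31.0, 31.5)):
    # count points per grid cell; returns an n_side x n_side matrix
    counts, _, _ = np.histogram2d(lats, lons, bins=n_side,
                                  range=[lat_range, lon_range])
    return counts

Normalized to sum to 1, the flattened matrix can serve as the location density used for initial taxi placement in the update generation above.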


Discussion:

  1. model traffic and speed
  2. network topology layout
  3. integrate update and query generation; GNRS performance evaluation

Modeling steps:
  1. initialize grids: each grid has a GNRS server; distribute taxis according to the location matrix from SUVNet, and insert mapping information into GNRS servers (K copies: the local GNRS server plus K-1 servers in other grids)
  2. generate the updates list (according to the transition probability matrix) and the lookups list (according to a modified Zipf law); updates and lookups are interleaved in ascending order of timestamp
  3. update: each update has K copies, one stored in the local GNRS and the other K-1 copies stored in GNRS servers determined by a hash function; update latency is the distance between the local grid and the farthest grid of the K-1 servers
  4. lookup: hash the GUID at the local GNRS server; lookup latency is the distance between the local grid and the nearest grid of the K servers
  5. examine each update/lookup in timestamp order and calculate the average update/lookup latency (see the sketch after this list)
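
A hedged Python sketch of steps 3 and 4, assuming Manhattan distance between grid cells as the latency proxy and an MD5-based placement hash; K, the distance metric, and the hash are illustrative choices, not the measured system:

import hashlib

N_SIDE = 10                 # 10 x 10 grids
N_GRIDS = N_SIDE * N_SIDE
K = 3                       # replicas per GUID (example value)

def grid_distance(a, b):
    # Manhattan distance between grid cells (assumed latency proxy)
    ar, ac = divmod(a, N_SIDE)
    br, bc = divmod(b, N_SIDE)
    return abs(ar - br) + abs(ac - bc)

def hashed_grids(guid, k=K - 1):
    # deterministically map a GUID to k remote GNRS grids
    h = int(hashlib.md5(str(guid).encode()).hexdigest(), 16)
    return [(h + i * 7919) % N_GRIDS for i in range(k)]

def update_latency(local_grid, guid):
    # step 3: farthest of the K-1 hashed replicas
    return max(grid_distance(local_grid, g) for g in hashed_grids(guid))

def lookup_latency(requester_grid, guid, guid_local_grid):
    # step 4: nearest of all K replicas (local copy + K-1 hashed copies)
    replicas = hashed_grids(guid) + [guid_local_grid]
    return min(grid_distance(requester_grid, g) for g in replicas)

Averaging these latencies over the interleaved update/lookup list in timestamp order gives step 5.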
