Design and Implementation of an Event-Aware Trip Planning Platform

Date: 2015-02-02. ID: sb201502022081. Author: 蜂朝网
Category: in-service master's thesis. Word count: 89601

Abstract:
This is an in-service master's thesis. The Event Awareness Trip Plan Platform provides an effective smart navigation service that considers not only static information about roads and transportation but also dynamic traffic-related events. To detect events such as congestion, traffic accidents, and road maintenance, the platform's Event Extraction System collects data from Twitter, websites, and the City Feed system.

Chapter 1 Introduction

One of the main concerns addressed by the European programme Horizon 2020[1] is the development of innovative solutions for sustainable transport and mobility, with particular emphasis on cities. Cities keep growing: it is estimated that more than 60% of the European population lives in cities. This concentration of people in small areas has highlighted several important issues, such as inadequate transport systems, inadequate city services, and increasing pollution[2].

With rapid urbanization, the Smart City is a vision of the future that aims to offer a high quality of life to people living, working, and traveling in the city. Smart mobility is one pillar of the Smart City and builds on new techniques integrated with modern information systems and communication networks. In particular, information is the key to realizing smart mobility (Figure 1-1).

As shown in Figure 1-1, the huge amount of data coming from different sources makes it possible to assist people's daily lives by forecasting and analyzing related information[3]. Based on such information, smart mobility contributes to daily life in terms of public transportation, city crowdsourcing management, and individual trip planning. A smart mobility system draws on several types of data sources: social data, open data, traditional web content, and others. How can these heterogeneous data be used to support individual mobility efficiently? The question is especially pressing because, with the popularity of social networks, information spreads much faster than before, and people can learn what is happening nearby in near real time.

The use of Information and Communication Technologies (ICTs) can be of the utmost importance for promoting new sustainable smart city systems with low environmental impact, and thus for improving the lives of citizens[4]. From this perspective, a Smart City is the outcome of integrating various systems, namely Traffic Management Systems, Transport Systems, User Systems, and Vehicle Systems[5].

In this thesis, we mainly focus on the design and implementation of the Transport Systems of a Smart City. In transport systems for smart cities, light-duty vehicles, i.e. cars, are surprisingly not only by far the most used means of transport but also the fastest growing one. Buses, rail, and low-pollution transport such as bikes are growing much more slowly[5]. Therefore, it is necessary to create a smart mobility solution for vehicles.

.........

Chapter 2 State of the Art

2.1 Research of Multiple Data Source Extracting

Traffic-related event data can derive from various sources, namely social media, web resources, public data sources, and private data sources. Each data source is introduced below.

1. Social media.

Data from social media, such as Twitter, Facebook, Weibo, and Google Plus, is collectively called social data. Although social networks like Twitter are a relatively new source of information, research on using this type of content in the context of situational awareness, crisis management, and security-related intelligence gathering has received much attention. Owing to the development of the Internet, people are more likely to publish their statuses and spread news. Compared with other sources, social media therefore provides a large volume of data in almost real time. On the other hand, most of this data is unstructured and has no specific topic, so social data often needs further processing to extract information.

To simplify data retrieval, social media sites often provide APIs, such as the Facebook API and the Twitter API. With these APIs, users do not need to be concerned with how data is stored and organized, or where to get it. Take the Twitter API as an example[8]. APIs to access Twitter data can be classified into two types based on their design and access method:

REST APIs are based on the REST architecture now popularly used for designing web APIs. These APIs use the pull strategy for data retrieval: to collect information, a user must explicitly request it.

Streaming APIs provide a continuous stream of public information from Twitter. These APIs use the push strategy for data retrieval: once a request for information is made, the Streaming APIs provide a continuous stream of updates with no further input from the user.

The two types have different capabilities and limitations with respect to what and how much information can be retrieved.
The Streaming API has three types of endpoints:

Public streams: streams containing the public Tweets on Twitter.

User streams: single-user streams, with access to all the Tweets of a user.

Site streams: multi-user streams, intended for applications that access Tweets from multiple users.

2. Web resources.

Web resources is the general name for articles and news on the Web. Compared with social data, they are usually well organized and focus on a certain topic. Traffic-related news, for example, clearly records when and where an event occurred. However, web resources often arrive with a long delay, since most of them are published after investigation, as a summary.

Web crawlers are often used to retrieve data from web resources. A Web crawler is an Internet bot that systematically browses the World Wide Web, typically for the purpose of Web indexing. A Web crawler may also be called a Web spider, an ant, or an automatic indexer[9]. A Web crawler starts with a list of URLs to visit. It parses each page to extract hyperlinks and puts these hyperlinks into the URL list for further access. A scheduler is often used to dispatch the analysis tasks. Some examples of web crawlers are:

DataparkSearch is a crawler and search engine released under the GNU General Public License.

Heritrix is the Internet Archive's archival-quality crawler, designed for archiving periodic snapshots of a large portion of the Web. It is written in Java.

Nutch is a crawler written in Java and released under an Apache License. It can be used in conjunction with the Lucene text-indexing package.

Scrapy is an open-source web crawling framework written in Python. It extracts content from websites with high performance and flexibility, and it is easy to use because much of the crawling machinery is encapsulated by the framework. Scrapy is widely used in many fields, such as data mining and automated testing[10].

3. Public data sources.

Public data sources, also known as crowdsourcing data, are data sets created collectively by the public. City Feed is a system of this kind that gathers city events from users. If a user finds that a new event has occurred, he can post it to the system and share it with others. The difference between social data and crowdsourcing data is that the former provides interactive information between users, while the latter concentrates on a certain scope of topics. Moreover, social data is often a description without clear properties, whereas crowdsourcing data itself records geographic, time, and type information. For tracking local events it is better to use public data sources; on the other hand, far fewer users share public data than use social media, so the amount of data from public sources cannot meet demand, especially for a specific topic.

Web services are widely used to get data from public data sources. A Web service is a method of communication between two electronic devices over a network. It is a software function provided at a network address over the web, with the service always on, as in the concept of utility computing. The W3C defines a Web service generally as a system that supports interaction between machines[11]. It uses the XML (Extensible Markup Language) format to serialize data over HTTP. To let services interact with each other, WSDL describes the interface of the service in detail, and SOAP defines the messages exchanged.

Several data formats are used in service interactions; XML and JSON are two of the most popular. XML is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It is defined in the XML 1.0 Specification produced by the W3C and in several other related specifications, all free open standards. JSON, or JavaScript Object Notation, is an open standard format that uses human-readable text to transmit data objects consisting of attribute–value pairs. It is used primarily to transmit data between a server and a web application, as an alternative to XML.
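To make the comparison of the two formats concrete, the following minimal sketch serializes one traffic event both as JSON and as XML using only the Python standard library. The event record and its field names (type, time, lat, lon) are hypothetical and chosen for illustration; they are not the schema used by City Feed or by the platform.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical event record; the field names are illustrative assumptions.
event = {
    "type": "congestion",
    "time": "2014-06-01T08:30:00",
    "lat": 45.0703,
    "lon": 7.6869,
}


def to_json(record):
    """Serialize the record as a JSON object of attribute-value pairs."""
    return json.dumps(record, sort_keys=True)


def to_xml(record):
    """Serialize the record as an <event> element with one child per field."""
    root = ET.Element("event")
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Parse the XML back into a dict; every value comes back as a string."""
    return {child.tag: child.text for child in ET.fromstring(text)}


json_text = to_json(event)
xml_text = to_xml(event)
print(json_text)
print(xml_text)
```

One practical difference shows up on the way back: the JSON round trip restores numbers as numbers, while plain XML returns every value as text, so the receiver has to know the expected types.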

2.2 Research of Event Aware Trip Planning

The default routing algorithm in OpenTripPlanner (OTP) is the A* algorithm[35], which utilizes a cost heuristic to prune the Dijkstra search[36]. At every intermediate location considered between the start and the target, the cost heuristic estimates a lower bound on the remaining travel cost to the target. The cost estimate for traversing this intermediate location is the sum of the cost to reach the location and the estimated remaining cost. In the formula for the cost of a path, A* uses g(n) to denote the path cost from the start point to vertex n and h(n) to denote the estimated cost from vertex n to the destination, so the total estimate is f(n) = g(n) + h(n).

City Trip Planner is a Web application that provides a trip planning service based on the interests and context of its users. The system covers five cities in Flanders, Belgium. The application mainly targets the needs of travelers; its most powerful function is planning a good route that covers most of the places a user is interested in. When using City Trip Planner, users can enter their interests to get a better solution that covers all the places they want to visit. The itineraries also take into account the opening hours of the different tourist attractions[37]. This system concentrates only on tourist navigation, not on daily trip planning.

SEI-Tur is a recommender system that uses Web services to create complete and consistent tours based on user preferences, combining travel, lodging, food supply, and local event programming. For each component of the tour, a domain ontology has been developed, and the discovery and selection of the best services is based on a sophisticated service selection and composition tool[38]. This system is more flexible than City Trip Planner, since it can recommend new spots according to their popularity. If large celebrations are taking place in the city, they are also added to the itinerary. However, it only records long-lasting events and ignores real-time traffic-related events.

Google Transit is a public transport route planner. Its coverage is publicly available and spreads worldwide, across hundreds of cities and sometimes entire countries such as China, Japan, and Switzerland. The coverage of major cities in the United States and Canada is almost exhaustive. It provides navigation using public transportation data from private sources and automatically helps users choose the best way to travel. However, users still cannot use it to plan car trips that take local events into account[39, 40].
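The A* search described at the start of this section can be illustrated with a minimal sketch. This is not OTP's implementation: the function name a_star, the toy graph, the coordinates, and the straight-line heuristic are all assumptions made for the example. The priority queue is ordered by g(n) + h(n), where g(n) is the cost from the start and h(n) is the lower-bound estimate to the goal.

```python
import heapq
import math


def a_star(graph, coords, start, goal):
    """Return (cost, path) for a cheapest path from start to goal, or None.

    graph:  {vertex: [(neighbor, edge_cost), ...]}   (hypothetical layout)
    coords: {vertex: (x, y)}, used by the straight-line heuristic h(n)
    """
    def h(n):
        (x1, y1), (x2, y2) = coords[n], coords[goal]
        return math.hypot(x2 - x1, y2 - y1)

    best_g = {start: 0.0}          # g(n): cheapest known cost from start
    parent = {start: None}
    frontier = [(h(start), 0.0, start)]  # entries are (g + h, g, vertex)
    while frontier:
        _, g, n = heapq.heappop(frontier)
        if g > best_g[n]:
            continue               # stale entry: a cheaper route was found
        if n == goal:
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return g, path[::-1]
        for m, cost in graph.get(n, []):
            new_g = g + cost
            if new_g < best_g.get(m, math.inf):
                best_g[m] = new_g
                parent[m] = n
                heapq.heappush(frontier, (new_g + h(m), new_g, m))
    return None


# Toy road network: edge costs are never below the straight-line distance,
# so the heuristic is a valid lower bound.
coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}
graph = {
    "A": [("B", 1.0), ("C", 2.0)],
    "B": [("D", 3.0)],
    "C": [("D", 1.0)],
}
result = a_star(graph, coords, "A", "D")
print(result)
```

In this toy network the route through C costs 3 in total while the route through B costs 4, so the search returns the path A, C, D.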

CHAPTER 3 SYSTEM REQUIREMENT ANALYSIS

3.1 THE GOAL OF THE SYSTEM

3.2 FUNCTIONAL REQUIREMENTS OF THE SYSTEM

CHAPTER 4 SYSTEM DESIGN

4.1 THE STRUCTURE OF MODULE FUNCTION

4.2 DESIGN OF EVENT EXTRACTION SYSTEM

CHAPTER 5 SYSTEM IMPLEMENTATION AND TESTING

5.1 THE ENVIRONMENT OF SYSTEM IMPLEMENTATION

Chapter 5 System Implementation and Testing

5.1 The Environment of System Implementation

NLTK was started in 2001 to support a computational linguistics course at the University of Pennsylvania. It is now widely used by many research groups and labs, which benefit from its continuous development and improvement. Some of its most important modules are nltk.corpus, a general interface to corpora and lexicons; nltk.stem and nltk.tag, for word stemming and part-of-speech (POS) tagging; and nltk.classify, which implements classification algorithms.

I use NLTK because of four features:

Modularity: each module is independent of the others, so it can be used separately.

Extensibility: each function is implemented in several ways with different policies, and it is easy to add new solutions.

Simplicity: NLTK is a framework that helps users build up general knowledge of natural language processing. The different modules can be used like Lego bricks to compose a complete solution.

Consistency: it provides self-documenting interfaces whose names are easy to learn and hard to confuse.

Despite these advantages, NLTK undeniably has some shortcomings. First, it is a framework with many components, not a ready-made system. It only provides implementations of some algorithms, and you need to prepare the training data set yourself. In other words, it guarantees only the correctness of the algorithm, not the accuracy of the result. Second, the toolkit is implemented in pure Python rather than C or C++, which keeps it simple and easy to understand. As a result, its performance is not highly optimized, though it is still efficient. Since the system focuses more on research work, the performance of NLTK is acceptable.

5.2 Key Program Flow Charts

As shown in Figure 5-1, the platform is composed of two subsystems, the Event Extraction System and the Trip Planner. The Event Extraction System collects data from external systems and extracts events, which are stored in the database. The Trip Planner provides a navigation service that takes the influence of events into account. It computes the cost of paths, ranks plans by cost, and finally returns the best itineraries to the user.

The detailed processing steps are as follows:

1. Extracting events.

The Event Extraction System gets data from external systems and outputs events continuously. First, it asks the Twitter server to verify its authorization and opens a data connection. When it receives data, it checks the data's validity before taking further action. At the same time, the extraction system uses a Python crawler to capture traffic-related news on websites. After getting text content from these two sources, it extracts events from the texts using natural language processing techniques. The extracted events are merged with those from the City Feed system, and the combined events are then stored in the database.

2. Requesting navigation.

In practice, user requests arrive all the time while the system is getting data from external systems; here we describe the process only in its logical sequence. First, the user can modify the parameters of the trip planning request, such as the start point, the destination, and the type of transportation. The Trip Planner packs all these attributes into a context and loads the map resources. It also searches the database to get the set of valid events.

3. Finding a path.

Once the Trip Planner has the valid events, it searches the map from start to end using the A* algorithm. During this process it calculates the cost of each vertex on the path. If the influence on a vertex has not yet been computed, it calculates the timeliness and distance of the event to get the final impact value. This value is added to the cost of the path. The candidate itineraries are ranked by cost, and the one with the least cost is returned.

4. Responding to requests.

The returned itinerary is displayed on the map interface. If the user selects the “By Bus” option, it shows the time schedule and the bus stops for transfers.

Event extraction and path finding are the most significant procedures in the whole process. For event extraction, retrieving data from Twitter and analyzing raw text are the most difficult parts; they are introduced in section 4.2.2 and section 4.2.3. For the itinerary search, the implementation of the A* algorithm is the key to success; it is discussed in section 4.2.4.
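The cost calculation in step 3 can be illustrated with a small sketch. The excerpt does not give the actual impact formula, so everything below is an assumption: the Event fields, the linear decay in both time and distance, and the function names impact, itinerary_cost, and rank are hypothetical. The sketch also applies the impact to finished candidate paths rather than inside the A* search, which is a simplification of what the text describes. It does keep the idea of computing each vertex's influence only once.

```python
from dataclasses import dataclass
import math


@dataclass
class Event:
    x: float
    y: float
    reported_at: float  # time of the report, in seconds
    lifetime: float     # assumed: seconds during which the event is valid
    radius: float       # assumed: distance within which vertices are affected
    penalty: float      # assumed: maximum extra cost caused by the event


def impact(event, vertex_xy, now):
    """Assumed impact: penalty scaled by linear decay in time and distance."""
    age = now - event.reported_at
    if age < 0 or age >= event.lifetime:
        return 0.0
    dist = math.hypot(vertex_xy[0] - event.x, vertex_xy[1] - event.y)
    if dist >= event.radius:
        return 0.0
    timeliness = 1.0 - age / event.lifetime
    proximity = 1.0 - dist / event.radius
    return event.penalty * timeliness * proximity


def itinerary_cost(path, coords, edge_cost, events, now, cache):
    """Base edge costs plus the event impact of every vertex on the path."""
    total = sum(edge_cost[(a, b)] for a, b in zip(path, path[1:]))
    for v in path:
        if v not in cache:  # compute each vertex's influence only once
            cache[v] = sum(impact(e, coords[v], now) for e in events)
        total += cache[v]
    return total


def rank(itineraries, coords, edge_cost, events, now):
    """Sort candidate itineraries by event-aware cost, cheapest first."""
    cache = {}
    return sorted(
        itineraries,
        key=lambda p: itinerary_cost(p, coords, edge_cost, events, now, cache),
    )


coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}
edge_cost = {("A", "B"): 1.0, ("B", "D"): 1.5, ("A", "C"): 2.0, ("C", "D"): 1.0}
candidates = [["A", "B", "D"], ["A", "C", "D"]]
accident = Event(x=1, y=0, reported_at=0, lifetime=3600, radius=1.0, penalty=4.0)

fresh = rank(candidates, coords, edge_cost, [accident], now=900)
expired = rank(candidates, coords, edge_cost, [accident], now=3600)
print(fresh[0], expired[0])
```

While the hypothetical accident at B is still recent, the route through C ranks first; once the event has expired, the shorter route through B is preferred again.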

........

Conclusion

The Event Awareness Trip Plan Platform provides an effective smart navigation service that considers not only static information about roads and transportation but also dynamic traffic-related events. To detect events such as congestion, traffic accidents, and road maintenance, the Event Extraction System, a subsystem of the platform, uses a data collection module to get data from Twitter, websites, and the City Feed system through APIs and web services. Since most of the data is unstructured, the extraction system analyzes the texts with a semantic analysis module and extracts events from the normalized content. After extraction, event integration is necessary to merge similar events from different sources. Finally, the events are stored in the database.

The Trip Planner is the other subsystem of the platform. It handles users' trip planning requests and is designed on top of OpenTripPlanner. It enhances the A* search algorithm by changing its heuristic formula for cost calculation. During the search, the influence of events on points is computed to support the cost calculation. By ranking the paths found, it returns the least-cost one to the user. It also has a web interface that displays events and trip itineraries.

The project is deployed online and works well, which shows that our design is feasible. Some problems remain, however. Firstly, the current solution calculates event influence during the search process, although for most vertices on the map the impact of events remains stable over a period of time. The current influence computation is therefore repetitive to a certain degree. Secondly, although events cannot be avoided when the user travels by public transportation, the system should still notify the user of the potential cost; at present it reports nothing.

Future work includes improving the method for calculating event influence and pushing event notifications when the user travels by public transportation. For events that last a long time, we can pre-calculate their impact and use these values directly in the path finding process. In this way, people can use the trip planning service without worrying about traffic and can enjoy the trip. With more data sources in the future, I believe the platform will become smarter and help people save a lot of time.

...........

References (omitted)


Tags: in-service master's thesis, smart city, trip planning, event awareness, event extraction, natural language processing