Research On House Price Forecasting With News Sentiment Analysis

Posted on: 2019-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: KEVIN JATI KURNIAJAYA
GTID: 2428330566997990
Subject: Computer Science and Technology
Abstract/Summary:
The advent of the Internet has changed how investors obtain news, which now reaches them through mobile phones, laptops, and tablets. The sheer volume of information available online makes it sensible to build an automated system that collects the news and performs text processing to check for any potential effect on specific asset markets. In any country, the housing market has a strong influence on the economy. It influences economic growth, affects the value of asset portfolios, and influences the profitability of financial institutions and the reliability of the financial system. Hence, many stakeholders are involved in the housing market.

This research analyses the housing markets of London, New York, and Sydney. The data used are average and median house prices and economic variables such as Gross Domestic Product, unemployment rate, Consumer Price Index, interest rate, money supply, stock market index, and number of new residences. In addition, relevant news concerning real estate in these three cities is collected from the Internet.

We propose three novel approaches in this study. The first algorithm is SSA-SARIMAX (Signal Sentiment Analysis-Seasonal Autoregressive Integrated Moving Average with Exogenous Variables). The news articles are preprocessed and then scored, with words related to the housing market and the economy given greater importance. We then average these scores to produce a single time series of news scores. This series is fed into the SARIMAX model along with all the economic data to obtain the predicted house price.

The second is DL-SARIMAX (Deep Learning-Seasonal Autoregressive Integrated Moving Average with Exogenous Variables). This algorithm has two main components: news analysis, and a traditional prediction model using the economic data. For the news component, positive and negative news items from each city are manually selected. We then apply an adjustment formula based on these two time series to the house price predicted by the SARIMAX model to produce the adjusted house price.

The third is DLC-SARIMAX (Deep Learning with Clustering-Seasonal Autoregressive Integrated Moving Average with Exogenous Variables). This algorithm also has two main components: news analysis and a traditional prediction model. For the news component, the deep learning approach called Distributed Memory is used to obtain document vectors for all the news articles. We then cluster similar documents into groups and apply an adjustment formula based on these groups to the house price predicted by the SARIMAX model to produce the adjusted house price.

The prediction results from these three proposed algorithms are compared with other prediction methods combined with various sentiment analysis methods. The best performance across all experiments comes from DLC-SARIMAX, which improves RMSE by 15.6% compared with the baseline SARIMAX algorithm. The second best is DL-SARIMAX, which reduces errors by 14.8%, while SSA-SARIMAX shows unstable results, improving the predictions in only some of the experiments.
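As a rough illustration of the news-scoring step described for the first approach and of how a percentage RMSE improvement over a baseline can be computed, the following minimal Python sketch may help. It is not the thesis's implementation: the lexicon, its weights, the per-token normalisation, and all function names (`score_article`, `monthly_news_scores`, `rmse_improvement`) are assumptions made for illustration only, and the numbers in the usage lines are toy values, not results from the study.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical lexicon: word -> (polarity, importance weight).
# Housing and economy terms get a larger weight, in the spirit of the
# abstract; the words and values here are invented for illustration.
LEXICON = {
    "rise": (1, 1.0),
    "growth": (1, 2.0),
    "boom": (1, 2.0),
    "fall": (-1, 1.0),
    "unemployment": (-1, 2.0),
    "crash": (-1, 2.0),
}


def score_article(text):
    """Weighted polarity of one article, normalised by its token count."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    total = sum(LEXICON[t][0] * LEXICON[t][1] for t in tokens if t in LEXICON)
    return total / len(tokens)


def monthly_news_scores(articles):
    """Average the article scores per month into one news-score series.

    `articles` is an iterable of (month, text) pairs; the result maps each
    month to the mean score of its articles, ordered by month.
    """
    buckets = defaultdict(list)
    for month, text in articles:
        buckets[month].append(score_article(text))
    return {m: sum(s) / len(s) for m, s in sorted(buckets.items())}


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def rmse_improvement(actual, baseline_pred, new_pred):
    """Percentage reduction in RMSE of new_pred relative to baseline_pred."""
    base = rmse(actual, baseline_pred)
    return 100.0 * (base - rmse(actual, new_pred)) / base


# Toy usage with made-up data.
series = monthly_news_scores([
    ("2018-01", "prices rise on growth"),
    ("2018-01", "market crash"),
    ("2018-02", "prices fall"),
])
gain = rmse_improvement([100, 110], [90, 120], [95, 115])
```

In a full pipeline, a series like `series` would be one of the exogenous inputs passed to a SARIMAX model alongside the economic variables, and `rmse_improvement` corresponds to the kind of relative comparison against the baseline that the abstract reports.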
Keywords/Search Tags:sentiment analysis, data collection, machine learning, real estate, prediction