| With the rise of the Web 2.0 concept, User Generated Content (UGC) has attracted widespread attention. In the age of big data, emerging social media platforms, with Sina Weibo as a representative example, produce a huge amount of UGC every day. This content carries considerable commercial and social value. Traditional data mining systems, however, struggle to keep pace with the speed at which it is generated, so providing a reliable big data processing platform is a significant research problem. Based on a study of Hadoop's core technology and several sub-projects in its ecosystem, this paper implements a scalable big data application system architecture for processing large-scale data sets. The architecture is organized hierarchically into four layers: the platform layer, the data layer, the business logic layer, and the application layer. For each layer, the paper introduces the main techniques and their optimization, including the optimization of Hadoop and HBase, the MapReduce programming model, and the technical details and clustering algorithm implementation of Mahout. Building on these layers, we combine the customer value hierarchy (CVH) model with a neural network classification model to propose a distributed brand value model, which is deployed on the application layer. The research and implementation of the overall system architecture have been refined accordingly. Finally, the paper summarizes the contents of each layer, describes the parts that may need improvement, and discusses future research directions. |