| Network science is a powerful tool to explore many complex systems in real world.The objects or entities in the complex systems are denoted as nodes,and the relations between objects are denoted as edges.Most of the existing work only considers the complex networks consisting of single-typed objects and their relations.However,many of the complex networks are composed of multi-typed objects and their multi-typed interconnected links.The networks consisting of multi-typed objects and their interconnected links are collectively referred to as heterogeneous information networks(HINs).This thesis mainly focuses on the problems on the formation of groups of objects with a specific type in HINs,including(1)the similarity measure between two objects of the same type in general HINs,(2)the profit-maximizing team formation in expert-skill-project HINs,and(3)the social activity group formation in user-activity HINs.All of the existing solutions to the above three kinds of problems have some extent limitations.About the similarity measure in HINs,The existing similarity metrics highly depend on the user-specified meta path or meta-structure.Although specifying a meta-path or metastructure can provide personalized services(the semantic-specific similarity)for users,this may cause the similarity to be sensitive to different meta-paths or meta-structures.In addition,it is quite difficult for non-proficient users to select appropriate meta-paths or metastructures.About the profit-maximizing team formation problem in expert-skill-project HINs,the team yielded by the existing algorithms contains some redundant experts.In addition,these algorithms do not allocate specific teams for the selected projects,and therefore the number of projects that each member in the final team participates in is unknown.About the social activity group formation problem in user-activity HINs,the existing algorithms can not provide diverse choices of activities for users.The main contributions can be summarized as:(1)This thesis proposes two kinds of schematic structures for the HINs: the stratified metastructure(SMS)and the recurrent meta-structure(RecurMS).Both of them can be constructed automatically without user involvement,and contain a lot of meta-paths and metastructures.For the SMS,we firstly define the commuting matrix of the meta-structure,and then combine all the commuting matrices of the meta-paths and meta-structures according to different weights.As a result,we obtain the commuting matrix of the SMS,and the SMSS is defined by the commuting matrix of the SMS.For the RecurMS,we propose an approach to decompose the RecurMS into several recurrent meta-paths and recurrent metatrees in order to decouple the object types,and then define the commuting matrices of the recurrent meta-paths and the recurrent meta-trees.To evaluate the importance of different recurrent meta-paths and recurrent meta-trees,we propose two kinds of weighting strategies:local weighting strategy and global weighting strategy.Finally,the RMSS is defined by the weighted combination of the commuting matrices of the recurrent meta-paths and recurrent meta-trees.It is worth noting that RMSS can also provide users with personalized services by specifying different weights for different recurrent meta-paths and recurrent meta-trees.(2)This thesis studies three kinds of sub-problems under different constraints(no redundancies,specific correspondence between projects and teams and participation constraint).For the non-redundant profit-maximizing team formation problem,this thesis proposes a redundancy-eliminating algorithm Eliminate based on the greedy strategy.Given a candidate team and a set of projects covered by this team,algorithm Eliminate successively selects the expert with the maximum coverage ratio until all of the skills required by these projects are covered.After eliminating the redundancies,the cost of the team must decrease.Some additional experts can be added to the team to finish more projects.To this end,this thesis proposes a team-augmenting algorithm Augment.This algorithm successively selects the project with the maximum selectivity value from the set of unselected projects,and the non-redundant team responsible for this project is then added to the current team.For the profit-maximizing team formation problem with the specific correspondence between projects and teams,this thesis improves the definition of the existing team formation problem in the expert-skill-project heterogeneous tripartite networks by allocating a specific team for each project,and devise a novel team formation algorithm Influence based on the voting strategy and greedy strategy.This algorithm takes a universal project-team map as input,and successively remove the expert with the minimum influence value until the cost of the team is less than the budget.Furthermore,we analyzes the workloads of the experts in the final team after getting the correspondence between projects and teams,and discover that the uneven workload inevitably happens.Thus,this thesis further considers the participationconstrained profit-maximizing team formation problem,and propose a Project-First algorithm(ProjectFirst)and an Expert-Removing algorithm(ERA).These two algorithms make a selection according to the ratio of the profit of the project to the cost of the corresponding non-redundant team.(3)This thesis proposes a novel algorithm for forming social activity group,which can provide diverse activity choices for users.This social activity group formation scheme explicitly distinguish the subjective and objective preference of users to activities.The subjective preference of users to activities can be estimated by an iterative method,and the objective preference of users to activities can be estimated by the Expectation-Maximization(EM)algorithm.Then,a weighted bipartite graph consisting of users and activities is constructed by the objective preference values.As a result,the integer programming is employed to model the social activity organization scheme.This integer programming can be solved by relaxing it to a linear programming.The linear programming maybe have no feasible solutions,we therefore propose an efficient algorithm for forming social activity group(AFSAG)which may violate the soft constraints in some degree. |