研究实体

统计学与大数据技术中心

Research Center for Statistics and Big Data Technology

日期：2021-01-22

中心定位：统计学与大数据技术中心聚焦于大数据基础算法、大数据分析与处理核心算法、人工智能与机器学习核心技术、大数据产品研发技术的研究，以国家大数据战略需求为导向，聚焦于国家经济建设和社会发展的重点领域，加强与行业应用的结合及企业的协同创新，研发大数据与人工智能产业所急需的核心技术，促进技术成果转化和产业孵化。通过承担国家重大科研项目和社会项目，形成大数据分析与人工智能技术研发的持续创新实力，为我国大数据与人工智能战略的实施做出重要贡献。目前主要人员构成如下：

Positioning: The Research Center for Statistics and Big Data Technology is committed to research on fundamental algorithms for big data, core algorithms for big data analysis and processing, core AI and machine learning technologies, and big data product R&D technologies. Guided by China's strategic needs in big data and focusing on key areas of economic construction and social development, the Center strengthens integration with industry applications and collaborative innovation with enterprises, develops core technologies urgently needed by the big data and AI industries, and promotes the transfer of technological achievements and industrial incubation. By undertaking major national scientific research projects and social projects, the Center builds a sustained capacity for innovation in the R&D of big data analysis and AI technologies, making important contributions to the implementation of China's big data and AI strategies. The main members of the Center are listed below:

代表性成果：长期以来，数据挖掘技术是以数据分布和产生数据的物理机制为基础而研发。基于“人一眼能看出二维问题的解”的观察，统计与大数据技术中心研究团队提出了“通过解释和模拟人为什么一眼能看得出的机理进行数据建模”的科学思想，并系统发展了基于视觉认知的数据挖掘新原理与新方法。所提视觉聚类器被评价为“是原创性的研究”“有深刻的数学原理”“做出了多个不平凡的贡献”。世界神经网络协会主席Wunsch在IEEE Trans NN 的综述中高度评价了这一“有趣的分层聚类”方法。视觉分类机通过模拟视皮层特征提取原理和视觉尺度自适应选择机制辨识, 解决了分类算法的模型选择问题, 被认为“解决了支撑向量机所面临的一个重要问题”。相关成果已被广泛用于地理数据分析(美国乔治亚大学Lan小组、路易斯桑那州立大学Wang小组)、图像处理(美国马里兰大学DeMenthon小组)和蛋白质结构分析(比利时那慕尔大学Leherte小组)。特别是Leherte小组长期将该方法用于蛋白质电子密度估计、结构辨识和内硫胺胃蛋白酶配体匹配等。视觉分类机等算法已被山西太原钢铁集团公司用于硅钢纵条纹及热连轧钢板质量控制，带来1100万元/年的直接经济效益。信息融合的“响尾蛇模式”已应用到国家重大工程型号，显著提高了跟踪目标航迹估计精度。研究成果获2011年国家科技进步二等奖，2019年陕西省科技进步一等奖。

Representative achievements: Data mining technologies have long been developed on the basis of data distributions and the physical mechanisms that generate the data. Starting from the observation that "people can see the solution to a two-dimensional problem at a glance", the research team of the Research Center for Statistics and Big Data Technology proposed the scientific idea of "building data models by explaining and simulating the mechanism by which people can see the solution at a glance", and has systematically developed new principles and methods of data mining based on visual cognition. The proposed visual clustering method has been evaluated as "original research" with "profound mathematical principles" that "makes several nontrivial contributions". Wunsch, President of the International Neural Network Society, spoke highly of this "interesting hierarchical clustering" method in a survey in IEEE Trans. NN. By simulating the feature extraction principle of the visual cortex and the adaptive scale selection mechanism of vision, the visual classifier solves the model selection problem of classification algorithms, and is considered to have "solved an important problem faced by support vector machines". The related results have been widely used in geographic data analysis (by the Lan group at the University of Georgia and the Wang group at Louisiana State University), image processing (by the DeMenthon group at the University of Maryland) and protein structure analysis (by the Leherte group at the University of Namur, Belgium). In particular, the Leherte group has long used this method for protein electron density estimation, structure identification, and endothiapepsin ligand matching. Algorithms such as the visual classifier have been used by Taiyuan Iron & Steel (Group) Co., Ltd. for quality control of longitudinal stripes in silicon steel and of hot-rolled steel plates, yielding a direct economic benefit of 11 million yuan per year. The "Rattlesnake Mode" for information fusion has been applied in a major national engineering program, significantly improving the accuracy of trajectory estimation for tracked targets. The research results won the Second Prize of the National Science and Technology Progress Award in 2011 and the First Prize of the Shaanxi Provincial Science and Technology Progress Award in 2019.

研究方向：

Research directions:

统计学与大数据技术中心以国家大数据与人工智能战略需求为导向，聚焦于国家经济建设和社会发展的重点领域，在国家自然科学基金重大项目“大数据的统计基础与分析方法”的支持下，着力于大数据基础算法与核心技术、人工智能与机器学习基础问题与基本模型、大数据与人工智能产品研发技术的研究，加强与行业应用的结合及企业的协同创新，打造校企紧密结合的强劲技术团队，进行建设体制和运行机制的深度革新，建成国际先进的大数据技术研发平台，研发大数据与人工智能产业所急需的核心技术，对我国的大数据产业发展提供实质性支撑。目前聚焦的研究内容有：

Guided by the national strategic needs in big data and artificial intelligence, and focusing on key areas of national economic construction and social development, the Research Center for Statistics and Big Data Technology is supported by "Statistical Foundations and Analysis Methods of Big Data", a major project of the National Natural Science Foundation of China. The Center is committed to research on fundamental algorithms and core technologies for big data, fundamental problems and basic models of AI and machine learning, and R&D technologies for big data and AI products. It strengthens integration with industry applications and collaborative innovation with enterprises, builds a strong technical team that closely links the university with enterprises, and carries out in-depth reform of its organizational system and operating mechanism. Its goals are to build an internationally advanced R&D platform for big data technologies and to develop core technologies urgently needed by the big data and AI industries, providing substantial support for the development of the big data industry in China. The current research focuses include:

大数据统计学基础与分析方法；
The statistical foundations and analysis methods of big data;
大数据分析处理的基础与核心算法；
The fundamental and core algorithms of big data analysis and processing;
针对开放环境的复杂形态大数据机器学习模型与算法研究；
Machine learning models and algorithms for big data with complex forms in open environments;
面向视频、图像、高光谱遥感等数据类型的智能分析与处理技术；
Intelligent analysis and processing technology for video, image, hyperspectral remote sensing data and other data types;
典型场景下的大数据应用。
Big data applications in typical scenarios.