Large Language Models Propel the Iteration of Computational Social Science
March 08, 2024 11:48 Source: "China Social Sciences" March 8, 2024 Issue 2848 Author: Gong Weigang

Data determines destiny, and algorithms change the future. Large language models such as ChatGPT and Gemini have burst onto the scene and swept the world, driving paradigm shifts in both industry and scientific research. This fully demonstrates the enormous potential of big data and vividly illustrates the power of algorithms to change history. Large language models are built on big data; what lets the data speak and turns it into a model is algorithms and cloud computing. In particular, the astronomical computing power that cloud computing provides allows big data to undergo a kind of "nuclear fusion".

Against this historical backdrop, what new developments are emerging in computational social science, which grew out of big data? And at such an important historical juncture, how can we advance computational social science in China and fully release the empowering effect of the new technological revolution on social science research? This article offers some analysis of these two questions.

Computational social science based on big data was proposed roughly 10 years ago and quickly became popular as an academic concept. The vision at the time was that, as human society went digital, big data would emerge endlessly in every field, offering unprecedented opportunities for social science to observe social systems and human history, and that social science research would transform and upgrade from a small-data paradigm to a big-data paradigm. Yet 10 years on, computational social science has not flourished as people expected. Important academic achievements in the field remain as rare as morning stars, and people have not seen the spring of hope that big data was supposed to bring to social science research.

Why did this happen? The core reason is the limited capability of the early artificial intelligence algorithms used in social science. Big data offers a rich ore body for observing social phenomena, but early algorithms could mine that ore only to a very limited degree. Put differently, facing the vast ocean of data, researchers could only gaze at it and sigh; all they could do was carry out fairly shallow analysis and development of big data, which greatly limited the potential of computational social science.

The logic is as follows. Big data is mostly unstructured: more than 95% of the data on our planet takes the form of text, images, video, and audio, and its volume is huge, at the scale of gigabytes, terabytes, petabytes, or more. Before the rise of large language models, the methods available to researchers for developing big data were statistical algorithms formed in the sampling era and relatively simple machine learning algorithms.

Take text mining as an example. Analyzing big text data generally involves four major tasks: entity extraction, sentiment computation, text clustering, and text classification. Over the past 10 years, researchers have relied on word frequency statistics, TF-IDF, and similar algorithms to take a "distant view" of text data, as in the study of human emotions by Scott A. Golder et al. (2011); used regular expressions, Stanford-NLP, and similar tools to extract entities, as in the study of human cultural development by Maximilian Schich et al. (2014); performed sentiment computation by training models on labeled data with machine learning algorithms; used bag-of-words models, Word2vec, BERT, and similar algorithms to embed text and convert it into numerical form, then clustered texts by topic with the K-means++ and LDA algorithms, as in Austin C. Kozlowski et al. (2019), who used the Google Books corpus to analyze changes in the meaning of class; and used bag-of-words models, BERT, and similar algorithms to vectorize text, then trained models with CNNs, RNNs, and similar algorithms to classify text semantically, as in Gong Weigang et al. (2019), who used BERT for the semantic classification of news. These algorithms can mine the surface layer of text data, but they remain far from truly understanding text. Take the bag-of-words model and the BERT model as examples: the former vectorizes text based on which words occur, and although the latter considers a word's relative relationships within its context, it still does not achieve genuine semantic understanding. Machine learning models built on these algorithms also face important problems such as poor transferability. In addition, these algorithms demand strong programming skills from researchers, which has made humanities and social science scholars wary of computational social science methods.
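The limitation of occurrence-based vectorization can be made concrete with a minimal sketch. The two sentences, the whitespace tokenizer, and the helper names `bag_of_words` and `cosine` are illustrative assumptions, not taken from any of the studies cited above. The sketch shows that a bag-of-words representation assigns identical vectors to two sentences that use the same words to say opposite things.

```python
from collections import Counter
import math


def bag_of_words(text: str) -> Counter:
    """Count how often each lowercase token occurs; word order is discarded."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sentences: same words, opposite meaning.
s1 = "the government criticized the company"
s2 = "the company criticized the government"

v1, v2 = bag_of_words(s1), bag_of_words(s2)
same_vector = v1 == v2          # the two count vectors are indistinguishable
similarity = cosine(v1, v2)     # so their similarity is (numerically) 1

print(same_vector, round(similarity, 6))
```

Because who criticized whom is invisible to the vector, any classifier or clustering step built on it inherits the same blind spot, which is the gap the article attributes to early algorithms.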

It was precisely this lack of early algorithmic capability that kept computational social science from realizing its value. As the popular saying goes: we are sitting on a gold mountain of big data, yet can only dig coal. Constrained by algorithmic capability, computational social science advanced slowly and even stagnated for a long time.

But all this is becoming the past. With the rise of large language models, big data analysis has rapidly advanced from the "cold weapon era" to the "firearms era". Computational social science is undergoing sweeping change, and its research methods are maturing.

This is because a large language model is itself the product of big data + computing power + algorithms. Large language models such as GPT-4 and Gemini are trained on vast amounts of data. Most importantly, these models not only command an astronomical amount of knowledge but also have strong semantic understanding and logical reasoning abilities. This, in turn, makes developing big data with the help of these models extremely effective and convenient. On the one hand, with large language models, processing unstructured data becomes simple and feasible, whether it is text or images, video, and audio. On the other hand, compared with the early algorithms of computational social science, processing unstructured data with a large language model is simple to operate and very convenient: researchers no longer have to label large amounts of data to train machine learning models, nor master complex programming to write code. In short, with large language models, developing big data becomes a matter of writing prompts and fine-tuning the model. This greatly lowers the threshold for entering big data development, democratizes algorithms, and greatly improves the efficiency of big data development.

Take text big data analysis as an example again. Of its four major tasks, the first three can be completed almost perfectly with generative large language models, and the fourth can be readily solved with embedding models and deep learning models. In short, with generative models, embedding models, and deep learning models, the four major tasks of text big data analysis can be solved almost perfectly, and the capability of text big data analysis has grown a thousandfold. Analysis tasks for unstructured big data such as images and video can likewise be handled conveniently and effectively with these three kinds of models.
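What "writing a prompt" might look like for the first three tasks can be sketched as follows. This is only an illustration under stated assumptions: the instruction wording, the task keys, and the helper `build_prompt` are hypothetical, and the step of sending the prompt to a model is deliberately left out because the article names no particular API.

```python
# Hypothetical instruction wording for three of the four text-analysis tasks.
TASK_INSTRUCTIONS = {
    "entity_extraction": "List every person, organization, and place mentioned in each text.",
    "sentiment": "Label the sentiment of each text as positive, negative, or neutral.",
    "clustering": "Group the numbered texts by topic and give each group a short label.",
}


def build_prompt(task: str, texts: list[str]) -> str:
    """Assemble a plain-text prompt: task instruction, output format, numbered texts."""
    if task not in TASK_INSTRUCTIONS:
        raise ValueError(f"unknown task: {task}")
    numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(texts, start=1))
    return (
        f"{TASK_INSTRUCTIONS[task]}\n"
        "Return the answer as JSON.\n\n"
        f"Texts:\n{numbered}"
    )


# Made-up example inputs; a real study would pass its own corpus in batches.
prompt = build_prompt(
    "sentiment",
    ["Prices rose again this month.", "The new park opened to large crowds."],
)
print(prompt)
```

The point of the sketch is that switching from sentiment labeling to entity extraction changes one dictionary key rather than requiring a newly labeled dataset and a retrained model.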

The disruptive change in algorithms has turned big data from a dormant resource into a gold mountain that science can mine, and the vast ocean of data has become a paradise open to free exploration. Most importantly, the emergence of large language models has democratized algorithms: big data analysis and development is no longer the preserve of a few scholars with programming skills, but an important tool that can empower all humanities and social science researchers.

To sum up, the combination of big data and large language models is bringing computational social science methods and systems based on big data analysis to gradual maturity, and the data intelligence formed by big data + large language models will have a profound impact on the paradigm of social science research. Observing social systems and human history through big data + large language models will inevitably carry social science research into a new era in which social science laws are discovered one after another. Social science research is about to enter its "Kepler era".

  Paths and strategies for advancing computational social science

So how can the Chinese social science community seize the opportunity of this wave of new technology to advance Chinese social science and build globally competitive scholarship? This article addresses the question from the perspective of building new liberal arts laboratories.

In my view, the development of computational social science research is inseparable from the support of computational social science laboratories. Correspondingly, computational social science laboratories should become an important area in the construction of new liberal arts laboratories; put differently, new liberal arts laboratories should be built mainly in the form of computational social science laboratories. Building a large number of them would carry Chinese social science research into the platform era. Why, then, should computational social science be advanced in the form of laboratories?

First, consider the content and form of a computational social science laboratory. It is a big data analysis laboratory that integrates big data, cloud computing, and algorithms. The laboratory is deployed in the cloud (public and private), the big data is stored on the cloud computing platform, and algorithms represented by large language models are deployed in the cloud. The cloud computing platform provides storage space and computing power for big data analysis, and the large models deployed on it provide the algorithms.

From this summary of the concept, readers will likely have grasped why computational social science must be advanced through laboratories built on cloud computing platforms. The main reason lies in the characteristics of big data: its volume is huge, its types are diverse, and it grows at high speed. Developing big data is possible only with the computing power and algorithms that a cloud computing platform provides. Take the global event database GDELT, widely used in social science, as an example. Its volume is as large as 15 TB; it collects global news data and uses a variety of algorithms to extract and transform the unstructured news text. Clearly, our own computers can no longer store and analyze data like this, and traditional social science research methods cannot develop it. Data storage and analysis must rely on a cloud computing platform, which makes big data analysis simple and practical and lets academic research ride along with technological change in industry. In this respect, public clouds are far more powerful than researchers' own servers.
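A rough back-of-the-envelope calculation illustrates the scale argument. Only the 15 TB figure comes from the article; the 1 TB disk size, the 200 MB/s sequential read rate, and the 100 parallel workers are assumptions chosen purely for illustration, and the parallel estimate assumes ideal scaling with no coordination overhead.

```python
DATASET_TB = 15        # GDELT volume as stated in the article
LOCAL_DISK_TB = 1      # assumption: capacity of a typical personal computer disk
READ_MB_PER_S = 200    # assumption: sustained sequential read rate of one machine
WORKERS = 100          # assumption: number of parallel cloud workers

dataset_bytes = DATASET_TB * 10**12

# Time for a single machine to read the whole dataset once.
seconds_single = dataset_bytes / (READ_MB_PER_S * 10**6)
hours_single = seconds_single / 3600

# Same scan split evenly across workers (idealized, no overhead).
minutes_parallel = seconds_single / WORKERS / 60

# How many local disks would be needed just to hold the data.
disks_needed = DATASET_TB / LOCAL_DISK_TB

print(f"single machine: {hours_single:.1f} h per full pass")
print(f"{WORKERS} workers: {minutes_parallel:.1f} min per full pass")
print(f"local disks needed: {disks_needed:.0f}")
```

Under these assumptions a single pass over the data already takes most of a day on one machine before any analysis is done, which is the practical reason the article gives for moving storage and computation to a cloud platform.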

Of course, from the perspective of national strategy, some important data is not suited to industry-run public cloud platforms. Establishing an independent, controllable, and data-secure public cloud platform for the social sciences, and ensuring that the algorithms on it keep pace with the times, should therefore become an important part of building Digital China.

In short, by integrating data, computing power, and algorithms, the computational social science laboratory maximizes the power of big data to reveal complex social systems and the laws of human behavior, thereby promoting the transformation of the social science research paradigm and ushering in a new era of social science research. Advancing computational social science depends on building computational social science laboratories, which integrate big data, cloud computing, and the various algorithms represented by large language models.

 (The author is an associate professor at the School of Sociology, Wuhan University, and a researcher at the Big Data Research Institute of Wuhan University)

​​Editor in charge: Zhang Jing

All rights reserved by China Social Sciences Magazine. No reprinting or use without permission.
