Application of Data Mining Technology to the Personalized Learning System of Extracurricular Sports

Abstract. Extracurricular sports learning takes many forms and covers rich content. Using information technology to guide students in extracurricular sports, to solve their digital learning problems, and to promote independent, personalized learning has become a current trend in network learning. This paper studies how the Bayesian classification method of data mining can determine students' learning styles in the development of the personalized learning system of extracurricular sports. Student performance is analyzed with the Apriori algorithm of association rules, and Web logs are mined with the FLAAT algorithm to find students' preferred paths and provide recommendations.

Key words: Personalized Learning; Data Mining; Learning Style; Algorithm

Because students need personalized learning, a learning system that meets their individual needs across multiple learning links is urgently required.

It should be a "hybrid learning" method that can classify and track students' learning conditions, mine their learning models and interests, and combine several learning methods according to each student's characteristics.

1. Current Status of Research and Study Contents of This System

1.1 Development Status of the Personalized System

In China, current network learning systems can be summarized as follows: most provide similar functions, focus on theory, and target a general audience, while few are specialized, practice-oriented, or personalized.

1.2 Study Contents of This System

Personalized learning [1] refers to carrying out educational activities based on learners' personality characteristics, in order to give full play to learners' initiative, promote the all-round, free and harmonious development of students, and further develop and stimulate their personalities and potential. Taking the development of the personalized learning system of extracurricular sports as an example, this paper studies and discusses in detail the application of database technology in the personalized system, and implements different personalized functions through different algorithms.

2. Determining the Learning Styles of Learners with Classification Technology

Learning style [2] refers to the personalized way of learning that a learner persistently uses. The purpose of classification is to establish a classification model. There are usually four classification methods: the distance-based method, the decision-tree method, the Bayesian method, and the rule induction method. This system applies Bayesian classification.

2.1 Main Characteristics

(1) Hypothesis probabilities can be calculated with domain knowledge and other prior information.
(2) All attributes can be used in classification.
(3) Objects with discrete attributes can be analyzed.
(4) Data samples can be added or removed, so learning can proceed incrementally.
(5) The directed-graph representation is intuitive, and the relationships between variables can be expressed with arcs.

2.2 Bayesian Theorem and Simple Bayesian Classification Method

Bayesian theorem relates the conditional probability and the marginal (edge) probability of random events $A$ and $B$:

$$P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{P(A)\,P(B \mid A)}{P(B)}$$

In this equation, $P(A \mid B)$ is the conditional probability of $A$ given that $B$ has occurred, also called the posterior probability of $A$; $P(A)$ is the prior or marginal probability of $A$; $P(B \mid A)$ is the conditional probability of $B$ given that $A$ has occurred, also called the likelihood; $P(B)$ is the prior or marginal probability of $B$, which also serves as the normalizing constant. Bayesian theorem can therefore be stated as:

Posterior probability = (Likelihood × Prior probability) / Normalizing constant

The simple Bayesian classification method is based on probability. It determines the class of a sample by calculating posterior probabilities. The simple Bayesian model is:

$$V_{MAP} = \arg\max_{V_j \in V} P(V_j \mid a_1, a_2, \ldots, a_n)$$

Here $V_j$ belongs to the set $V$ of target values, $a_1, \ldots, a_n$ are the attributes of the given example, and $V_{MAP}$ is the most probable target value for that example. Because $V_{MAP}$ is the target value with the largest posterior probability, it is expressed with arg max. Applying the Bayesian formula to $P(V_j \mid a_1, a_2, \ldots, a_n)$ gives:

$$V_{MAP} = \arg\max_{V_j \in V} \frac{P(a_1, a_2, \ldots, a_n \mid V_j)\,P(V_j)}{P(a_1, a_2, \ldots, a_n)}$$

The denominator $P(a_1, a_2, \ldots, a_n)$ is the same for every $V_j$, so it does not affect the maximization and can be dropped:

$$V_{MAP} = \arg\max_{V_j \in V} P(a_1, a_2, \ldots, a_n \mid V_j)\,P(V_j)$$

Simple Bayesian classification makes the simplifying assumption that the attributes are independent of each other once the target value is given. The independence is therefore conditional on a given target value. Under this assumption, the joint probability of the attributes $(a_1, a_2, \ldots, a_n)$ is the product of the probabilities of the individual attributes:

$$P(a_1, a_2, \ldots, a_n \mid V_j) = \prod_{i=1}^{n} P(a_i \mid V_j)$$

so that

$$V_{MAP} = \arg\max_{V_j \in V} P(V_j) \prod_{i=1}^{n} P(a_i \mid V_j)$$

Through this calculation, the most probable target value is obtained from the attributes $(a_1, a_2, \ldots, a_n)$ of the sample to be classified. That is, the conditional probability $P(V_j \mid a_1, a_2, \ldots, a_n)$ is evaluated for each $V_j \in V$ ($j = 1, 2, \ldots, m$), and the class label with the largest conditional probability is output as the type of the sample.

2.3 Application of Bayesian Classification to the Personalized Learning System

In this learning system, the learning styles of learners are classified mainly with the simple Bayesian classification method.

(1) Establishing Sample Data
According to the characteristics of physical education science, the following common learning styles are defined: visual orientation (V), language orientation (L) and kinesthetic orientation (K). Each learner is treated as a vector $S$; each learning record of the learner is used as an attribute $A_i$; the learning style of the learner is the possible class $C$. For an arbitrary vector $S = (a_1, a_2, \ldots, a_n)$ to be classified, the probability $P(C_i \mid S)$ that $S$ belongs to each class must be known. It is obtained from the Bayesian formula

$$P(C_i \mid S) = \frac{P(S \mid C_i)\,P(C_i)}{P(S)}$$

and the class with the largest probability is the predicted class of $S$.

(2) Recognizing Learners
When a student logs in to the system, the system first checks, by the student's ID, whether the learning style database already holds a record of this student's learning style.

(3) Preprocessing Data
In analysis and processing, the attributes of the training sample set required by the classification model should be obtained, including the video learning frequency (V), text learning frequency (T), activity frequency (A), text learning average score (TS), video learning average score (VS), activity score (AS), and learning style (S).

2.4 Establishing the Learning Style Model

According to the Bayesian maximum a posteriori criterion, the posterior probability $P(C_j \mid a_1, a_2, \ldots, a_n)$ is determined for any unknown sample $s = (a_1, a_2, \ldots, a_n)$, and the class with the largest posterior probability is taken as the learning style of this sample. The specific steps are as follows:

(1) The prior probability $P(C_i)$ of each learning style is calculated.
(2) For each class, the conditional probability $P(a_j \mid C_i)$ of every value of each attribute is calculated from the training sample set.
(3) The probabilities of the unknown sample for the three classes are calculated, and the class with the largest probability is selected as the learning style of the student.

The key step in establishing the learning style classification model is to set up the training sample set, as shown in Table 2-1.

Tab. 2-1: Training samples set

2.5 Establishing the Algorithm

According to the above analysis, the following algorithm is designed.

Input: the training sample set $D_{mn}$ and the unknown sample $X_{m-1}$
Output: the learning style $X_s$ of the unknown sample

Begin
(1) Initialize the training set D
(2) Calculate the prior probability of each learning style
    for each class i do
        SPi = scounti / n;
    end for
    for each attribute xj of the sample X do
        count the number of times countji that the value xj appears with class i in the sample set D
        for each class i do
            Pji = countji / scounti;
        end for
    end for
(3) Initialize, then select the class with the largest probability as the learning style of the student
    for each class i do
        newPi = 1;
        for each attribute xj do
            newPi = newPi * Pji;
        end for
        newPi = newPi * SPi;
    end for
    Xs = arg max newPi;
End

3. Analyzing the Performance of Students with Association Rules

Association rule mining finds strong association rules, that is, rules whose support (approval rating) and confidence are greater than or equal to the minimum support and the minimum confidence given by users.

3.1 Apriori Algorithm

The Apriori algorithm is a frequent item-set algorithm for Boolean association rules. Its principle is to find all frequent item-sets first, and then to derive the strong association rules that meet the minimum support and the minimum confidence.

3.2 Application of Apriori Algorithm in the Personalized Learning System

By analyzing the knowledge points at each interest level in the basketball club and the corresponding activity results, association rules describing the relationships between chapters are mined.

(1) Data reduction
To help students learn and memorize, and to obtain target frequent item-sets suitable for mining with the Apriori algorithm, the relational table must first be converted into the corresponding transaction database.

(2) Generating the transaction database
Attribute values, such as autonomously choosing learning contents, are replaced with codes, as listed in Table 3-1.

Tab. 3-1: Codes

The records in the transaction database D are shown in Table 3-2.

Tab. 3-2: Transaction database

(3) Generating frequent item-sets
The abstraction above yields the transaction set. Assuming a minimum support of 25%, all frequent item-sets of D are found with the Apriori algorithm.

(4) Generating association rules
According to the Apriori algorithm, all non-empty proper subsets of each frequent k-item-set are found, and the confidence of the corresponding rules is then calculated.
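Steps (3) and (4) above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the system. The transaction database is invented toy data, because Tab. 3-2 is not reproduced here. The item codes A to D and the function names `apriori` and `association_rules` are hypothetical. Only the 25% minimum support (approval rating) comes from the text; the 70% minimum confidence is an assumed value. The confidence of a rule X => Y is computed as support(X and Y together) / support(X).

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {item-set: support} for every frequent item-set."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Candidate 1-item-sets: every distinct item in the database
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates whose support reaches the threshold
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: combine frequent k-item-sets into (k+1)-item-sets
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k:
                # Prune step: every (k-1)-subset must itself be frequent
                if all(frozenset(s) in level
                       for s in combinations(union, k - 1)):
                    candidates.add(union)
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for each strong rule."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        # Every non-empty proper subset is a possible antecedent
        for r in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), r):
                antecedent = frozenset(subset)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules

# Hypothetical transaction database: each record lists coded items
transactions = [
    ["A", "B", "C"], ["A", "B"], ["A", "C"], ["B", "C"],
    ["A", "B", "C"], ["D"], ["A", "D"], ["B"],
]

frequent = apriori(transactions, min_support=0.25)
rules = association_rules(frequent, min_confidence=0.70)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "=>", sorted(consequent), round(confidence, 2))
```

The pruning step relies on the property that every subset of a frequent item-set is also frequent, which is why `association_rules` can look up the support of each antecedent directly in `frequent`.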

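For the classification algorithm of Section 2.5, a minimal Python sketch might look like the following. It follows the three steps of the pseudocode: the priors SPi, the conditional probabilities Pji, and the product followed by arg max. Several things are assumptions made only for illustration: the training records are invented (Tab. 2-1 is not reproduced here), the three frequency attributes are assumed to be already discretized into "high" and "low", and the function names `train` and `classify` are hypothetical. As in the pseudocode, no smoothing is applied, so an attribute value never seen with a class gives that class a probability of zero.

```python
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (attribute_tuple, style) pairs."""
    n = len(samples)
    class_counts = Counter(style for _, style in samples)      # scount_i
    priors = {c: cnt / n for c, cnt in class_counts.items()}   # SP_i
    cond_counts = defaultdict(int)                             # count_ji
    for attrs, style in samples:
        for j, value in enumerate(attrs):
            cond_counts[(j, value, style)] += 1
    return priors, class_counts, cond_counts

def classify(x, priors, class_counts, cond_counts):
    """Return the most probable style for x and the score of every style."""
    scores = {}
    for c, prior in priors.items():
        p = 1.0                                                # newP_i = 1
        for j, value in enumerate(x):
            # P_ji = count_ji / scount_i
            p *= cond_counts.get((j, value, c), 0) / class_counts[c]
        scores[c] = p * prior                                  # newP_i * SP_i
    return max(scores, key=scores.get), scores

# Hypothetical training set: (video freq, text freq, activity freq) -> style
samples = [
    (("high", "low", "low"), "V"),
    (("high", "low", "high"), "V"),
    (("high", "high", "low"), "V"),
    (("low", "high", "low"), "L"),
    (("low", "high", "high"), "L"),
    (("high", "high", "low"), "L"),
    (("low", "low", "high"), "K"),
    (("low", "high", "high"), "K"),
    (("high", "low", "high"), "K"),
]

model = train(samples)
style, scores = classify(("high", "low", "low"), *model)
print(style, scores)
```

The scores are unnormalized: the common denominator P(a1, ..., an) is omitted because it does not change which class has the largest value.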