Predicting Human Interest: An Application of Artificial Intelligence and Uncertainty Quantification
 Tanveer Ahmed^{1} and
 Abhishek Srivastava^{1}
DOI: 10.1186/s4046701600512
© The Author(s) 2016
Received: 27 July 2016
Accepted: 5 October 2016
Published: 26 October 2016
Abstract
The idea that a machine can numerically estimate the interest of an individual towards any entity (e.g., WhatsApp, Facebook) is fascinating. Interest, however, is a complex human property that cannot be quantified by another person; to have a machine-driven method quantify this unobservable and intangible internal property is challenging. In this paper, we make an attempt to address this issue. We propose a novel approach to estimate this internal state of a human. We formulate the interest prediction problem as a hidden state estimation problem and deduce a solution through Bayesian inference. In doing so, we apply indirect inference rules to estimate interest from activity. Activity as a consequence of interest is computed via a subjective-objective weighted approach. We further propose a model for interest by taking inspiration from physics. We use mean-reverting stochastic procedures to capture the long-term dynamics of interest. With this perspective, a solution is provided via Monte Carlo simulations. To demonstrate the feasibility of the framework, we develop a web-based prototype and experiment with real-world datasets.
Keywords
Interest prediction; Data engineering; Interest; Uncertainty quantification; Artificial intelligence
Introduction
Numerically estimating interest towards an entity is one of the challenging problems in the literature. In everyday social experience, one often comes across the following question: How “much” are you interested in Facebook, WhatsApp, Twitter (or anything)? In other words, we are asking someone to quantify interest towards an object in the real world. Simply put, we are asking for a number that represents someone’s interest. The question is simple and straightforward, yet answering it is challenging. Since we ourselves, as human beings, cannot answer the question precisely, having a machine quantify interest is undeniably a challenge.
Interest is an old topic of research. According to [1], interest is an everyday term used to describe the preference of a person towards real-world objects. It has also been specified that interest is an active propulsive state that is aligned towards a real entity, subject, topic, activity, etc. and has a high personal definition [1–4]. There are ample definitions of interest in the literature, and the notion spans a broad spectrum, with work refining and contextualizing the idea over time. For instance, some authors call it a mental state [5] or an affective state [6], and others even mix it with intrinsic motivation [1]. In this paper, however, we work along the definition proposed in [7–10] and regard interest as a persistent cognitive and emotional state of mind that makes one engage, get inspired, and sometimes even compels one to take actions towards the object of his/her interest. Interest motivates a person to extend his/her normative capability, thereby reaching the extent of exception and hence indicating the presence of some very promising feature of the human spirit.
 1)
Interest in its innate form has a tendency to evolve. We have experienced numerous times that interest (in any object) has an inherent characteristic, even a natural propensity, to evolve. For example, a person may have been highly interested in participating on Facebook, but the desire to engage in the daily activities decreased over time (for whatever reason). The issue here is therefore how to model the evolving dynamics of interest, especially considering the typically erratic and unpredictable circumstances in one’s routine.
 2)
We know, and it has been verified in the literature, that interest in an object makes a person put in extra effort and take actions. Naturally, these actions are visible in the form of activity [18–21]. An appropriate question in this context is how to measure activity. If we take a close look at the term (activity), we realize that it is an abstract concept. A person always expresses his/her actions through multiple viewpoints. That is, activity spans several perspectives, which clearly indicates that the actions stimulated by interest are not limited to a single dimension. Therefore, how can the idea of activity, an abstract term having a wide array of dimensions, be transformed into a computationally operable construct?
 3)
How to model the transformation of interest into activity? It was discussed in the previous point that interest results in actions. However, the transformation dynamics of interest into activity are unknown. In other words, we do not have a statistical procedure that can model the transformation of interest into activity.

The problem of estimating interest is formulated as a hidden state estimation problem, and a solution is deduced via Bayesian inference. We use principles of uncertainty quantification and machine-oriented procedures to infer the numerical value of interest indirectly from activity.

To provide a computationally feasible method to calculate activity, we use a subjective-objective weighted approach. We combine several different perspectives of activity into a discrete and computationally acceptable construct.

We draw inspiration from physics and propose a square-root-based mean-reverting stochastic procedure to model the dynamics of interest.

We use a regression model to dynamically transform interest into activity.

We combine the contributions of the previous points and present a solution via Monte Carlo simulations. We use a particle filter to provide a computationally viable solution to the interest prediction problem.

To demonstrate the viability of the model in real scenarios, we perform experiments on real datasets. With numerical simulations performed on the Stack Overflow databases, we show that model-based procedures are a good way to estimate variables that are not directly observable.

To validate the proposed framework in practice, we develop a prototype. We implement the framework as a web service and deploy it on a web server. The prototype is developed using RESTful architecture, thereby providing a uniform interface to access the method by any remote or local application.
We must point out that the idea of interest is broad and that the concept has manifold (theoretical) interpretations in the literature. Therefore, the study and analysis of interest via machines, like the notion itself, can be imprecise. However, it captures a phenomenon of practical importance. Interest in any object makes a person engage and put in extra effort; consequently, there is activity [18–21]. In this context, and drawing on existing terms in artificial intelligence, e.g., [6, 11, 15, 17], the motive of the paper is to propose the use of automatic methods to quantify the phenomenon that provokes activity. The authors of [22] have specified “As a branch of the science of “Big Data”, the field of human-interest dynamics is at its infancy”. We base the motivation of this paper along these lines and try to complement work in the literature by offering a possible roadmap to study and analyze interest and other related properties through machine-based procedures.
The rest of the paper is organized as follows: In the “Methods” section, we discuss the main body of work and present the proposed model. Subsequently, we present the results in the “Results” section. As we are trying to predict the internal state of a human, we discuss the limitations of the framework in the “Discussion and Limitations” section. Finally, we conclude with the future work in the “Conclusion and Future Work” section.
Methods
Interest Prediction: The Bayesian Perspective
In this section, we describe the theoretical foundation of the proposed work. The method takes its inspiration from Bayesian inference. Bayesian statistics has found its way into many disciplines and is well appreciated for both linear and nonlinear problems. With respect to Bayesian inference and the interest prediction problem, the goal of the paper is as follows:
where \(\hat{m}_{t}: R^{q}\times R^{v}\rightarrow R^{p}\) and \(\hbar_{ok}\) is an i.i.d. error.
From Eqs. (3) to (5), we now have a theoretical understanding of the problem and its possible solution. The next step is to provide a computationally feasible method. To do that, we need specific definitions for the measurement function, the transformation function, a procedure to calculate activity, and a Bayesian filter. We define each of these components in the following sections. We start with the procedure to calculate activity.
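Because the problem is posed as hidden state estimation, it may help to recall the generic recursive form of Bayesian filtering. The notation below is an assumption introduced for illustration (\(I_t\) for the hidden interest, \(A_t\) for the observed activity, \(A_{1:t}\) for the activity history up to time \(t\)); it is the textbook recursion and is not a restatement of Eqs. (3) to (5).

```latex
% Prediction: propagate the previous posterior through the transformation (state) model
p(I_t \mid A_{1:t-1}) = \int p(I_t \mid I_{t-1})\, p(I_{t-1} \mid A_{1:t-1})\, dI_{t-1}

% Update: correct the prediction with the newly observed activity (measurement model)
p(I_t \mid A_{1:t}) = \frac{p(A_t \mid I_t)\, p(I_t \mid A_{1:t-1})}
                           {\int p(A_t \mid I_t)\, p(I_t \mid A_{1:t-1})\, dI_t}
```

The first density, \(p(I_t \mid I_{t-1})\), is supplied by the transformation function and the second, \(p(A_t \mid I_t)\), by the measurement function; a Bayesian filter approximates the two integrals.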
Computationally Measuring Activity
where the attribute \(a_{b}^{\varPhi }\) denotes the \(b\)th perspective of activity at time Φ, e.g., if we consider the previous use case (a person is interested in a mobile game), \(a_{b}\) can denote the amount of time spent playing (on any day), \(a_{b+1}\) can denote the number of gaming sessions, and so on.
where \({w_{i}^{A}} \in [0, 1]\) is the weight of the \(i\)th attribute \(a_{i}\) and \(\sum _{i=1}^{z} {w_{i}^{A}} = 1\).
It should be noted here that for the function \(\hat {l}\), the attributes (or perspectives) of activity are always context and application dependent. Moreover, they must be considered separately for every object of interest. For instance, when a person is interested in a social networking website (for any reason), the possible perspectives of activity are the following: the number of messages, the number of profiles browsed, the number of times a user logged in, the duration of each login session, etc. When a person is interested in an outdoor sport (e.g., football), the perspectives could be the amount of time spent playing, time spent practicing, time spent learning game strategies, and so on. To generalize, there is no single, definite, and universal attribute set; the perspectives of activity are application-dependent. We have to identify the attributes and measure activity separately for every application or entity of interest.
After outlining the idea of perspectives, let us proceed to the next issue: how to calculate the weights (\({w_{i}^{A}}\)). Weights specify the numerical preference of a person towards the available perspectives of activity. In practice, a number of factors influence the choice and alignment of a person towards a particular perspective of activity. Two people interested in the same object need not show the same preference towards a specific perspective. Consider the case of social networking websites, e.g., Facebook. Some people show a bias towards the number of messages; that is, if they engage in chats, it signifies a much higher level of interest. Others show a bias towards the number of profiles browsed, i.e., they like to browse the profiles of other people. Similarly, others spend long periods surfing through their respective walls. To generalize, human behavior is sensitive to a variety of factors that shape one’s character. Moreover, one’s choices are mostly, though not always, influenced by one’s subjectivity. Therefore, the method to calculate weights must incorporate the subjective nature of humans. The literature, however, has pointed out that choices made under subjectivity are sometimes not the best [25]. One must therefore also consider the element of objectivity. With respect to this reasoning, we employ a subjective-objective approach for allocating weights. We use the formulations presented in [26, 27] for the purpose.
where \(W^{S}\) is the subjective weight matrix, \(W^{O}\) is the objective weight matrix, \(AT^{\varPhi}\) denotes the attribute matrix, and \(\alpha \in (0,1)\) is the bias parameter.
Modeling Human Interest
 1.
Interest is stochastic. The motive here is backed by work in analytical psychology where internal human processes are often represented as stochastic procedures, e.g., recognition [28]. We can therefore expect interest to be stochastic. If, however, this is false, then interest is deterministic, and we can predict human behavior at any time. One can see that this is a contradiction. Hence, interest is stochastic.
 2.
Interest does not increase continuously with time. To prove this, let us consider the opposite: interest is an ever-increasing continuous function. However, owing to typical erratic and inevitable circumstances in one’s daily routine, the cycle follows an uncertain behavior (for instance, ups and downs). Therefore, using proof by contradiction, interest is not an ever-increasing function.
 3.
There is no such thing as negative interest. Mathematically speaking, interest can be zero, i.e., not interested at all, or positive, a factor that specifies some degree of interest. The everyday term negative interest implies that one dislikes an entity, which clearly indicates the absence of interest.
 1.
It is assumed that interest fluctuates around a constant numerical value in the long run. We have observed and experienced many times that when one engages with an entity (e.g., a video game), then interest is usually high in the beginning, but it stabilizes in the long run.
 2.
Interest is assumed to be a diffusion process (A Markov process without jumps).

The equation describes the motion of a particle in space. The movement of the particle follows a random behavior at each interval of time.

λ denotes the speed with which the particle is pulled back towards μ (the speed of mean reversion).

μ is the long-term mean value. μ corresponds to the point in space where the particle will settle down in the long run (this property is called mean reversion).

σ is the volatility component that controls the extent of randomness in the particle’s motion.

The term \(\sigma \sqrt {I_{t}}\) avoids the possibility of having negative interest values.
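The properties listed above can be illustrated with a short simulation. Eq. (11) is not reproduced in this text, so the sketch below assumes the standard square-root mean-reverting form dI = λ(μ − I)dt + σ√I dW, which matches the roles of λ, μ, σ and the term σ√I described above. The function name, the parameter values, and the clipping at zero are illustrative choices, not the authors' implementation.

```python
import math
import random

def simulate_interest(i0, lam, mu, sigma, n_steps, dt=1.0, seed=None):
    """Euler-Maruyama path of dI = lam*(mu - I)*dt + sigma*sqrt(I)*dW (assumed form)."""
    rng = random.Random(seed)
    path = [i0]
    for _ in range(n_steps):
        i_t = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))          # Brownian increment over dt
        drift = lam * (mu - i_t) * dt               # pull towards the long-term mean
        diffusion = sigma * math.sqrt(max(i_t, 0.0)) * dw
        # A discretised step can overshoot below zero even though the
        # continuous-time process stays non-negative, so clip at zero.
        path.append(max(i_t + drift + diffusion, 0.0))
    return path

# Hypothetical values: interest starts high and settles around mu = 3.
path = simulate_interest(i0=8.0, lam=0.3, mu=3.0, sigma=0.4, n_steps=365, seed=1)
print(path[:5])
```

With sigma set to zero the path decays geometrically towards mu, which is the mean-reversion property in its deterministic form.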
Following these mathematical foundations and properties, we use Eq. (11) to capture the dynamics of interest. The formulation of the process, however, is not complete. Since we are trying to make the procedure of interest quantification automatic, we need to go into more detail. From Eq. (11), we see that the framework depends upon three crucial parameters: λ, μ, and σ. We therefore need a method to estimate their values. This procedure is explained in the following subsection.
Parameter Estimation
In the literature, there is a huge body of work dedicated to parameter estimation for stochastic differential equations (SDEs). It has been argued several times that if the parameters of the equation are correct, we can get good numerical estimates of the modeled phenomenon [33]. Parameter estimation for SDEs, however, is nontrivial (the results are extensively elaborated upon in the literature, e.g., [33]). Nevertheless, we must find close-enough values. In this paper, we use one of the approximation techniques. More specifically, we estimate the parameters of Eq. (11) using the method of least squares [34]. The procedure is elaborated upon in the following.
Using this procedure, we obtain the estimates \((\hat {\lambda }, \hat {\mu }, \hat {\sigma })\) of (λ,μ,σ); hence, we have a data-driven statistical method that can effectively model interest.
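The estimating equations are not reproduced in this text, so the following is only one common way to carry out a least-squares fit for a square-root mean-reverting model, and it may differ from the procedure of [34]. It assumes the Euler-discretised form of the model, divides each increment by √I so that the noise has constant variance, and solves the resulting two-regressor regression. The function name and the requirement of a strictly positive series are assumptions of the sketch.

```python
import math

def estimate_parameters(series, dt=1.0):
    """Least-squares estimates (lam, mu, sigma) from a strictly positive series.

    Assumed regression, obtained by dividing the Euler step by sqrt(I_t):
        (I_{t+1} - I_t) / sqrt(I_t) = a * dt / sqrt(I_t) + b * dt * sqrt(I_t) + noise
    with a = lam * mu, b = -lam, and noise variance sigma**2 * dt.
    """
    x1, x2, y = [], [], []
    for cur, nxt in zip(series[:-1], series[1:]):
        root = math.sqrt(cur)
        x1.append(dt / root)
        x2.append(dt * root)
        y.append((nxt - cur) / root)

    # Normal equations of the two-regressor, no-intercept least-squares problem.
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s1y = sum(u * w for u, w in zip(x1, y))
    s2y = sum(v * w for v, w in zip(x2, y))
    det = s11 * s22 - s12 * s12
    a = (s1y * s22 - s2y * s12) / det
    b = (s2y * s11 - s1y * s12) / det

    lam = -b
    mu = a / lam
    residuals = [w - a * u - b * v for u, v, w in zip(x1, x2, y)]
    dof = max(len(y) - 2, 1)
    sigma = math.sqrt(sum(r * r for r in residuals) / (dof * dt))
    return lam, mu, sigma

# Noise-free check series generated with lam = 0.5 and mu = 4.0.
series = [1.0]
for _ in range(20):
    series.append(series[-1] + 0.5 * (4.0 - series[-1]))
print(estimate_parameters(series))
```

On the noise-free series the fit recovers lam = 0.5 and mu = 4.0 up to floating-point error, with sigma close to zero; on real activity-derived data the estimates are only approximations, as the text notes.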
Interest Resulting Into Activity
where \(\bar {I} = \frac {\sum I_{i}}{n}\), \(\bar {A} = \frac {\sum A_{i}}{n}\).
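Since Eq. (23) is not reproduced in this text, the sketch below assumes the simplest regression consistent with the sample means defined above: an ordinary least-squares line A = β0 + β1·I. The function names and the linear form are assumptions for illustration; the authors' measurement function may differ.

```python
def fit_interest_to_activity(interest, activity):
    """Ordinary least-squares fit of activity = intercept + slope * interest."""
    n = len(interest)
    i_bar = sum(interest) / n          # sample mean of interest
    a_bar = sum(activity) / n          # sample mean of activity
    sxy = sum((i - i_bar) * (a - a_bar) for i, a in zip(interest, activity))
    sxx = sum((i - i_bar) ** 2 for i in interest)
    slope = sxy / sxx
    intercept = a_bar - slope * i_bar
    return intercept, slope

def predict_activity(interest_value, intercept, slope):
    """Measurement function under the linear assumption."""
    return intercept + slope * interest_value

# Hypothetical, exactly linear data for illustration.
intercept, slope = fit_interest_to_activity([1, 2, 3, 4], [3, 5, 7, 9])
print(intercept, slope, predict_activity(5, intercept, slope))
```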
From the procedure discussed in this section, we have the definition for the measurement function. Hence, we have a model-based structure that can estimate interest. However, before we can solve the interest quantification problem, we need one more component. In this context, we direct attention to the “Interest Prediction: The Bayesian Perspective” section and Bayesian inference. Bayesian inference problems rely upon three ingredients: (1) the measurement function, (2) the transformation function, and (3) a Bayesian filter. So far, we have the definition of the transformation function (Eq. (11)) and the measurement function (Eq. (23)). We still need a Bayesian filter. For this purpose, we use Monte Carlo simulations. Specifically, we employ particle filters.
Particle Filters
Particle filters (PFs) are probabilistic algorithms that are frequently encountered in uncertainty quantification and Bayesian inference problems. They are a member of the Monte Carlo class of simulations. PFs have been found to be highly efficient in cases where the underlying structure of the model is not accurate [36]. This property is especially beneficial for the interest quantification problem as the precise evolutionary dynamics of the model are unknown. PFs target high-density areas of the interest space to compute close numerical values. The algorithm represents the posterior by a group of particles and a set of associated weights. As the number of particles increases, the particles converge to the approximate posterior density, thereby producing good numerical estimates of interest. For this purpose, the system is provided a set of particles, Pr, at time t, Pr = (\({\eta _{t}^{x}}\), \({w_{t}^{x}}\)). Here, x = {1, 2, ..., Z}, Z is the number of particles, \({\eta _{t}^{x}}\) represents a numerical hypothesis for interest, and \({w_{t}^{x}}\) is the weight (or the importance factor) of the xth hypothesis with \(\sum _{x=1}^{Z} {w_{t}^{x}} =1\). The weights are chosen via importance sampling (step 5). PFs propagate each probable estimate sequentially and support every hypothesis using the importance factor, thereby providing good approximate values for interest. For reasons of brevity, the exact procedure of the particle filter with encoded definitions of the transformation function and the measurement function is summarized in Algorithm 1.
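Algorithm 1 is not reproduced in this text. As a rough illustration of the idea, the sketch below implements a generic bootstrap (sampling-importance-resampling) particle filter. It assumes a square-root mean-reverting transition for interest, a linear measurement function with Gaussian noise of standard deviation `obs_std`, uniform initial particles, and multinomial resampling; all of these, and every name in the code, are assumptions rather than the authors' algorithm. A value of `None` in the activity sequence is treated as a day without observations, for which the filter predicts but does not update.

```python
import math
import random

def particle_filter(activity, lam, mu, sigma, intercept, slope, obs_std,
                    n_particles=100, dt=1.0, seed=None):
    """Bootstrap particle filter; returns one interest estimate per time step."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 2.0 * mu) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    estimates = []
    for observed in activity:
        # 1. Predict: move every hypothesis through the transformation function.
        moved = []
        for eta in particles:
            dw = rng.gauss(0.0, math.sqrt(dt))
            step = lam * (mu - eta) * dt + sigma * math.sqrt(eta) * dw
            moved.append(max(eta + step, 0.0))
        particles = moved
        if observed is not None:
            # 2. Update: reweight by the Gaussian likelihood of the observed
            #    activity under the (assumed linear) measurement function.
            likes = []
            for eta in particles:
                err = (observed - (intercept + slope * eta)) / obs_std
                likes.append(math.exp(-0.5 * err * err))
            weights = [w * l for w, l in zip(weights, likes)]
            total = sum(weights)
            if total == 0.0:   # every likelihood underflowed: fall back to uniform
                weights = [1.0 / n_particles] * n_particles
            else:
                weights = [w / total for w in weights]
        # 3. Estimate: weighted mean of the hypotheses.
        estimates.append(sum(w * eta for w, eta in zip(weights, particles)))
        if observed is not None:
            # 4. Resample (multinomial) and reset the weights to uniform.
            particles = rng.choices(particles, weights=weights, k=n_particles)
            weights = [1.0 / n_particles] * n_particles
    return estimates

# Hypothetical activity values; the third day has no observation.
est = particle_filter([2.0, 2.5, None, 3.0], lam=0.5, mu=3.0, sigma=0.3,
                      intercept=0.0, slope=1.0, obs_std=0.5,
                      n_particles=200, seed=0)
print(est)
```

Resampling at every observed step keeps the sketch short; resampling only when the effective sample size drops is a common alternative.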
Gaps in Activity
To provide a computationally feasible solution from the above representation, we use Eq. (11) during activity gaps. In simple words, when we face an activity gap, the system automatically evolves interest over the next interval of time. For instance, in the previous use case (where one is interested in playing football), the system automatically uses Eq. (11) on the day the person is not able to engage. Note that in this case we can predict the value of interest, but we cannot update it. However, once new information (about activity) is available, we use Eqs. (11) and (23) to predict and update the interest value using the particle filter. Thus, we have a method similar to a continuous-time model of interest. This type of modeling is beneficial as we mathematically expect any internal human state to be a continuous-time function.
Results
Data Collection and Experimental Setup
To validate the viability of the proposed theory in practice, we experiment with real datasets. More specifically, we use the datasets provided by StackOverflow^{1}. This platform is a mature and highly respected Q&A discussion forum on the Internet. Further, it has one of the largest public data repositories. Owing to these characteristics, it has attracted a good amount of attention in the literature. Work has found that the users of StackOverflow are addicted to participating in its daily activities [37, 38]. Therefore, this online platform presents an excellent opportunity to test the feasibility of the model in real scenarios. To test the model, we collected the granular details of 250 users online^{2}. This was done on a daily basis for one whole year. As the data was collected on a day-by-day basis, interest was also estimated daily. For the purpose of discretization (of SDEs), we use the Euler-Maruyama method [35].
In this paper, we have deduced interest from activity. Therefore, the first step in the experimental setup is to calculate activity. Recall that we discussed in the “Computationally Measuring Activity” section that activity depended upon several attributes (or perspectives). In this regard, and for the purpose of experimentation, we collected the following attributes: (1) the number of comments, (2) the number of answers, (3) the number of questions, (4) the number of edits, and (5) the time to answer a question. Owing to reasons of privacy, we could not include more attributes. Nevertheless, with these numerical attributes, the procedure to calculate activity is elaborated upon in the following points.
Activity Calculation
 1.
We fed the system a pairwise comparison matrix to calculate the value of the weights. This was done by following the method discussed in [26, 27]. The subjective-objective weight matrices obtained after applying the procedure are presented in Table 1.
 2.
To explain the procedure of activity calculation in detail, an example is discussed in Table 2. In this table, we have presented the case of one random user. Further, and for illustration purposes, we have shown activity calculation for 7 days only (one can generalize the method for any number of days). It was specified in the above paragraph that we collected five different attributes from StackOverflow. They are represented under the columns AT1, …, AT5. From this data, we then normalized the attributes between 1 and 10. The normalized attributes are shown under the columns NT1, …, NT5. The matrix containing these normalized attributes is called the attribute matrix. Recall that to calculate activity via Eq. (8), we need the attribute matrix, the subjective-objective matrices, and the bias parameter.
 3.
From steps 1 and 2, we obtained the attribute matrix and the subjective-objective weight matrices. Further, with the bias parameter (α) as specified in the table, we used Eq. (8) to calculate the numerical values of activity. The resulting activity vector is shown in the table under the heading activity. An example for day 1 is also explained in the table.
 4.
Steps 1–3 were followed on a daily basis for 250 users; the data thus consisted of 250 activity vectors. The dataset for numerical activity is made public and can be found at https://drive.google.com/file/d/0BevB9aVUINtbm1YNlZnOVMtVmc/view?usp=sharing.
Table 1 Subjective-objective weights for the experiment
Attribute  Subjective  Objective 

Answers  0.1549  0.5941 
Questions  0.1333  0.1166 
Comments  0.2127  0.0916 
Edits  0.1944  0.0536 
Time to answer  0.3047  0.1441 
Table 2 Activity calculation
Day  AT1  AT2  AT3  AT4  AT5  NT1  NT2  NT3  NT4  NT5  Activity

Day 1  1  1  1  1  46  1  1  1  1  1.003  1.0005 
Day 2  1  1  1  1  15  1  1  1  1  1  0.999 
Day 3  1  1  1  1  236  1  1  1  1  1.0211  1.0043 
Day 4  1  1  1  1  156  1  1  1  1  1.0138  1.0027 
Day 5  5  1  1  2  91,920.2  10  1  1  10  10  7.63 
Day 6  4  2  1  1  32.5  7.75  10  1  1  1.001  4.933 
Day 7  3  1  2  1  113  5.5  1  10  1  1.009  4.1451 
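The figures in the two tables above can be cross-checked with a few lines of code. Eq. (8) is not reproduced in this text, so the sketch assumes that activity is the weighted sum of the normalized attributes with combined weights α·W^S + (1 − α)·W^O, that AT1 to AT5 follow the row order of Table 1 (answers, questions, comments, edits, time to answer), and that normalization is min-max scaling to [1, 10] over the days shown. The value α = 0.4 is likewise an assumption: it is not stated in the text and was chosen because it reproduces the tabulated activities for days 5 to 7 to within rounding.

```python
SUBJECTIVE = [0.1549, 0.1333, 0.2127, 0.1944, 0.3047]  # Table 1, "Subjective"
OBJECTIVE = [0.5941, 0.1166, 0.0916, 0.0536, 0.1441]   # Table 1, "Objective"
ALPHA = 0.4  # assumed bias parameter, inferred from the tabulated activities

def normalise(column, low=1.0, high=10.0):
    """Min-max scale one attribute column to [low, high]."""
    lo, hi = min(column), max(column)
    if hi == lo:                       # constant column: map everything to `low`
        return [low for _ in column]
    return [low + (high - low) * (v - lo) / (hi - lo) for v in column]

def activity(normalised_row, alpha=ALPHA):
    """Weighted sum of normalized attributes with combined weights (assumed form)."""
    weights = [alpha * s + (1.0 - alpha) * o
               for s, o in zip(SUBJECTIVE, OBJECTIVE)]
    return sum(w * v for w, v in zip(weights, normalised_row))

print(normalise([1, 1, 1, 1, 5, 4, 3]))       # AT1 column of Table 2
print(activity([10, 1, 1, 10, 10]))           # day 5, tabulated as 7.63
print(activity([7.75, 10, 1, 1, 1.001]))      # day 6, tabulated as 4.933
print(activity([5.5, 1, 10, 1, 1.009]))       # day 7, tabulated as 4.1451
```

Under these assumptions the normalized AT1 column matches NT1 exactly, and the three computed activities agree with the table to about three decimal places.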
Once we obtained the activity vector, the subsequent step was to predict interest. The following points summarize the procedure to obtain the numerical interest values.
Interest Estimation

After calculating activity, we coded the definitions of the transformation function (Eq. (11)) and the measurement function (Eq. (23)) into the particle filter. In terms of Bayesian statistics, the Bayesian filter was encoded with the state model and the output model.

Once the definitions of the two functions were encoded, the subsequent step was to feed the particle filter the input data for activity. The data for activity was obtained by following the procedure described under the “Activity Calculation” section.

Lastly, we used Algorithm 1 to predict the numerical interest vector for all the 250 users. In other words, using basic rules of the Monte Carlo simulations, we obtained 250 interest vectors.
As is commonly known in Bayesian inference problems, the method is not only able to estimate interest, but it can also predict activity. Therefore, based on the predicted value of activity and the actual activity available to the system, we evaluate the performance of the model. We have chosen the traditional RMSE & MAE as the error metrics. The procedure to obtain the error values is explained in the following points.
Procedure to Obtain RMSE and MAE
 1.
From the procedure discussed under the heading interest estimation, we obtained 250 interest vectors as well as 250 “predicted” activity vectors. Further, from the procedure explained under the “Activity Calculation” section, the system had 250 “actual” activity vectors. Therefore, using the basic rules of error calculation, we computed the RMSE and MAE for every user separately.
 2.
From the previous step, we obtained 250 RMSE and MAE values (one for each user). We then took the average of all the 250 RMSE and MAE values, thereby we obtained only one numerical value for RMSE and MAE.
 3.
We repeated steps 1 and 2 50 times. As a result, we obtained 50 numerical RMSE and MAE values.
 4.
We then took the average of the 50 numerical error values obtained from steps 1–3 and obtained a single number. We present this value in the paper. It represents an overall measure of the predictive capability of the system.
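The four steps above amount to a nested average, which can be sketched as follows. The callable `run_filter` is hypothetical: it stands for one stochastic run of the filter that maps a user's actual activity vector to a predicted activity vector.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equally long vectors."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error between two equally long vectors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean(values):
    return sum(values) / len(values)

def overall_error(run_filter, actual_by_user, n_repeats=50):
    """Average per-user errors over users (steps 1-2), then over repeats (steps 3-4)."""
    rmse_runs, mae_runs = [], []
    for _ in range(n_repeats):
        preds = [run_filter(actual) for actual in actual_by_user]
        rmse_runs.append(mean([rmse(a, p) for a, p in zip(actual_by_user, preds)]))
        mae_runs.append(mean([mae(a, p) for a, p in zip(actual_by_user, preds)]))
    return mean(rmse_runs), mean(mae_runs)

# Toy stand-in for the filter: every prediction is off by exactly one unit.
print(overall_error(lambda a: [v + 1 for v in a], [[1, 2], [3, 4]], n_repeats=3))
```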
Prototype Development
Model Analysis
Comparison with Similar Procedures
In this section, we test the performance of the proposed model for interest. More specifically, we test the feasibility of the square-root-based mean-reverting stochastic differential equation (Eq. (11)) to capture the dynamics of interest. We have compared the performance of the framework with some of the widely followed procedures in the literature: random walk (RW), geometric Brownian motion (GBM), and the Ornstein-Uhlenbeck (OU) process [32].
Comparison of the proposed model with similar procedures
Model  MAE  RMSE

Random walk  1.6702465926  2.763589123
Geometric Brownian motion  1.9440715077  5.025938536
Ornstein-Uhlenbeck process  1.6006146649  1.9056068
Proposed framework  1.46330893  1.78828697 
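To make the difference between the compared models concrete, the sketch below writes one Euler-Maruyama step for each of them, with the standard normal draw `z` passed in explicitly. The parameterizations are the textbook ones and are assumptions here; the text does not state how the comparison models were discretized or parameterized.

```python
import math

def step_random_walk(x, z, sigma, dt=1.0):
    """Pure diffusion: no drift, constant noise scale."""
    return x + sigma * math.sqrt(dt) * z

def step_gbm(x, z, drift, sigma, dt=1.0):
    """Geometric Brownian motion: drift and noise both proportional to x."""
    return x + drift * x * dt + sigma * x * math.sqrt(dt) * z

def step_ou(x, z, lam, mu, sigma, dt=1.0):
    """Ornstein-Uhlenbeck: mean reversion with constant noise (can go negative)."""
    return x + lam * (mu - x) * dt + sigma * math.sqrt(dt) * z

def step_square_root(x, z, lam, mu, sigma, dt=1.0):
    """Square-root mean reversion: noise shrinks as x approaches zero."""
    return x + lam * (mu - x) * dt + sigma * math.sqrt(max(x, 0.0)) * math.sqrt(dt) * z

# Same starting point and shock for the two mean-reverting models.
print(step_ou(4.0, 1.0, lam=0.5, mu=4.0, sigma=0.3))
print(step_square_root(4.0, 1.0, lam=0.5, mu=4.0, sigma=0.3))
```

The two mean-reverting steps share the same drift; they differ only in how the shock is scaled, which is what keeps the square-root variant away from negative values near zero.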
Impact of Varying the Parameters of Particle Filters
Accuracy and execution time for different iterations. Number of particles = 20
Number of iterations  MAE  RMSE  Execution time (ms) 

10  1.8205927612  2.0737191934  26,017 
20  1.4398928187  1.7549857745  50,180 
30  1.1140898629  1.5255709094  70,207 
40  0.8602689489  1.3382325218  89,795 
50  0.6233805855  1.2421697673  103,734 
100  0.2821038912  1.0196928238  178,232 
Accuracy and execution time for different numbers of particles. Number of iterations = 10
Number of particles  MAE  RMSE  Execution time (ms) 

10  1.8206345202  2.0835584618  12,818 
20  1.8198054154  2.0735584618  24,876 
30  1.8161826225  2.0460381487  31,761 
40  1.8139458681  2.0417554565  38,753 
50  1.8132130837  2.0382101465  45,859 
100  1.8130539885  2.0361421347  86,864 
Discussion and Limitations
 1.
Theoretically speaking, the method can quantify interest from measurable activity. However, from a practical point of view, it is not universal. The proposed model will fail in situations when the system cannot measure activity. The prevailing technology is not advanced enough to observe activity for every possible application and object of interest, for instance, we cannot measure activity in the case when a person is interested in reading books. In this scenario, the system is incapable of recording activity; therefore, we do not have data. The presence of data is imperative for the proper working of the system. Hence, for this and similar situations, we cannot estimate interest.
 2.
Connected to the previous problem is the situation where a person is interested in an entity but has not taken any steps to express interest. We understand that interest is an intangible mental variable that we are trying to model via computational approaches. Therefore, we have to face several mechanistic realities. That is, if we expect to estimate interest via machines, then data must be fed to an algorithm. If interest is only considered as an inner feeling, as something that can be expressed without any perceptible medium (or media), then we cannot expect a computational agent to quantify this construct. In this regard, and similar to the previous point, we need data to support the operations of the system. The absence of which leaves the system incapable of working.
 3. One can deduce that through the procedure employed in this paper, we get a number for interest. But a good question is: what does the number imply? This question has two points of view.

First, comparing the interest of users. It should be noted here that there is no criterion to compare the interest of people on a common scale. For example, a user has an interest of 0.1, whereas another one has an interest of 0.4 (towards a common entity). This does not imply that the second user is more interested in the entity of common interest. We cannot compare the interest of one individual with the interest of another by weighing them both on the same scale. The rationale here is backed by work in psychology. To explain the idea, we quote a few words: “Interest is an active propulsive state that is aligned towards real objects, and has a high personal definition” [4]. In the context of these lines, we must specify that interest has a “personal” meaning. In simple words, every user has his/her own way to express interest. For the interest quantification problem, we must not make the mistake of defining a common criterion to measure interest.

The second point of view is how to devise an algorithm that can understand or feel the number. This question is rather tough to answer. Although the literature in affective computing has devised several methods for emotion quantification, an accurate computational method that can understand or feel human-like emotions, or any other internal mental state, is tough to implement. Nevertheless, we must point out that we have not tried to answer this question. We have tried to find the number.

 4. The last problem is the mathematical framework.

We have modeled interest using a square-root-based mean-reverting stochastic differential equation. Mean-reverting stochastic differential equations are employed in a plethora of work that deals with uncertainty, e.g., [43–46]. However, it is not claimed here that the proposed model is accurate. Interest quantification is a challenging task that needs more effort. Moreover, efforts need not be limited to mean-reverting procedures.

The second problem is parameter estimation. Accurately estimating the parameters of SDEs is a standing problem in the literature. Work has specified: if the parameters are precise, we can obtain good approximations of the underlying phenomenon [39]. However, parameter estimation is nontrivial. We therefore employed one of the approximation techniques. Consequently, the method suffers from performance issues. Nevertheless, the numbers presented in the “Results” section show acceptable performance.

We modeled the transformation of interest into activity through a regression model. To back this rationale, we quote a few words: “Existing computational assume a positive correlation between stimulation and curiosity” [47]. In this paper, we worked along the idea behind these lines. Although it is acceptable that interest and activity can have a positive correlation, we do not claim that this idea is universal. One has to understand that, similar to modeling interest, engineering a statistical procedure to transform interest into activity is also nontrivial. More effort is needed to statistically understand the way interest transforms into activity.

Conclusion and Future Work
In this paper, we proposed a method to model and correspondingly estimate interest using statistical procedures. The interest prediction problem was formulated as a hidden state estimation problem, and a solution was provided via Bayesian inference. Activity was calculated via a subjective-objective weighted approach. Subsequently, indirect inference rules were employed to infer numerical estimates of interest from activity. A model for interest was proposed by drawing inspiration from physics. Interest was modeled as a square-root-based mean-reverting stochastic procedure. A particle filter was employed to provide a computationally feasible solution to the problem. A prototype was developed and experimentation was performed on real datasets. Through numerical investigation, it was found that the procedure showed acceptable performance. Several limitations of the proposed method were discussed in detail.
The work presented in this paper only scratches the surface of estimating and modeling interest. In future work, we need to improve the accuracy of the model. Purely from an engineering point of view, we saw that mean-reverting stochastic procedures are a good option to model interest. However, we need to test the mathematical aspects of the theory in detail. Moreover, we need to look into more advanced procedures to model the evolution of interest.
Endnotes
^{1} http://www.stackoverflow.com
^{2} http://data.stackexchange.com/
Declarations
Authors’ contributions
TA carried out the study in the paper and drafted the first version of the manuscript. TA and AS designed the framework and developed the proposed method together. AS revised the first version of the manuscript. Both authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
1. Schiefele, U: Interest, learning, and motivation. Educ. Psychol. 26(3–4), 299–323 (1991).
2. Hidi, S, Baird, W: Strategies for increasing text-based interest and students’ recall of expository texts. Read. Res. Q. 23(4), 465–483 (1988).
3. Hidi, S, Baird, W: Interestingness – a neglected variable in discourse processing. Cogn. Sci. 10(2), 179–194 (1986).
4. Dewey, J: Interest and effort in education. Houghton Mifflin (1913).
5. Hirayama, T, Dodane, JB, Kawashima, H, Matsuyama, T: Estimates of user interest using timing structures between proactive content-display updates and eye movements. IEICE Trans. Inf. Syst. 93(6), 1470–1478 (2010).
6. Schuller, B, Rigoll, G: Recognising interest in conversational speech – comparing bag of frames and suprasegmental features. In: INTERSPEECH, pp. 1999–2002. Curran Associates, Inc., Brighton (2009).
7. Hidi, S, Harackiewicz, JM: Motivating the academically unmotivated: a critical issue for the 21st century. Rev. Educ. Res. 70(2), 151–179 (2000).
8. Krapp, A: Interest and human development during adolescence: An educational-psychological approach. Motivational Psychology of Human Development – Developing Motivation and Motivating Development. 131, 109–128 (2000).
9. Rathunde, K: Undivided and abiding interest: comparisons across studies of talented adolescents and creative adults. Interest Learn: IPN, Kiel, 367–376 (1998).
10. Renninger, K: Individual interest and its implications for understanding intrinsic motivation. Educational Psychology. 13, 373–404 (2000).
11. Schuller, B, Müller, R, Eyben, F, Gast, J, Hörnler, B, Wöllmer, M, et al: Being bored? Recognising natural interest by extensive audiovisual integration for real-life application. Image Vis. Comput. 27(12), 1760–1774 (2009).
12. Ashraf, AB, Lucey, S, Cohn, JF, Chen, T, Ambadar, Z, Prkachin, KM, et al: The painful face–pain expression recognition using active appearance models. Image Vision Comput. 27(12), 1788–1796 (2009).
13. Batliner, A, Steidl, S, Schuller, B, Seppi, D, Laskowski, K, Vogt, T, et al: Combining efforts for improving automatic classification of emotional user states. In: Proc ISLTC, pp. 240–245. Informacijska Druzba (Information Society), Ljubljana (2006).
14. Gündüz, Ş, Özsu, MT: Recommendation models for user accesses to web pages. In: Artificial Neural Networks and Neural Information Processing – ICANN/ICONIP 2003, pp. 1003–1010. Springer (2003).
15. Zhang, Y, Koren, J: Efficient Bayesian hierarchical user modeling for recommendation system. In: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 47–54. ACM, Amsterdam (2007).
16. Bennett, PN, White, RW, Chu, W, Dumais, ST, Bailey, P, Borisyuk, F, et al: Modeling the impact of short- and long-term behavior on search personalization. In: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, pp. 185–194. ACM, Oregon (2012).
17. White, RW, Bailey, P, Chen, L: Predicting user interests from contextual information. In: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pp. 363–370. ACM, Boston (2009).
18. Ryan, RM, Deci, EL: Intrinsic and extrinsic motivations: classic definitions and new directions. Contemp. Educ. Psychol. 25(1), 54–67 (2000).
19. Deci, EL, Ryan, RM: Intrinsic motivation and self-determination in human behavior. Plenum. 86, New York and London (1985).
20. Anderson, RC: Interestingness of children’s reading material. Center for the Study of Reading Technical Report; no. 323. 3, 287–299 (1984).
21. Hidi, S: Interest and its contribution as a mental resource for learning. Rev. Educ. Res. 60(4), 549–571 (1990).
22. Zhao, ZD, Yang, Z, Zhang, Z, Zhou, T, Huang, ZG, Lai, YC: Emergence of scaling in human-interest dynamics. Sci. Rep. Nat. 3 (2013).
23. Wang, TC, Lee, HD: Developing a fuzzy TOPSIS approach based on subjective weights and objective weights. Expert Syst. Appl. 36(5), 8980–8985 (2009).
24. Wang, YM, Luo, Y: Integration of correlations with standard deviations for determining attribute weights in multiple attribute decision making. Math. Comput. Model. 51(1), 1–12 (2010).
25. Hastie, R, Dawes, RM: Rational choice in an uncertain world: the psychology of judgment and decision making. Sage (2010).
26. Fan, ZP: Complicated multiple attribute decision making: theory and applications. Ph. D. Dissertation, Northeastern University, Shenyang (1996).
27. Chu, A, Kalaba, R, Spingarn, K: A comparison of two methods for determining the weights of belonging to fuzzy sets. J. Optim. Theory Appl. 27(4), 531–538 (1979).
28. Ashby, FG: A stochastic version of general recognition theory. J. Math. Psychol. 44(2), 310–329 (2000).
29. Babuska, I, Tempone, R, Zouraris, GE: Galerkin finite element approximations of stochastic elliptic partial differential equations. SIAM J. Numeric. Anal. 42(2), 800–825 (2004).
30. Babuška, I, Nobile, F, Tempone, R: A stochastic collocation method for elliptic partial differential equations with random input data. SIAM J. Numeric. Anal. 45(3), 1005–1034 (2007).
31. Øksendal, B: Stochastic differential equations, pp. 65–84. Springer (2003).
32. Uhlenbeck, GE, Ornstein, LS: On the theory of the Brownian motion. Phys. Rev. 36(5), 823 (1930).
33. Phillips, PC, Yu, J: Maximum likelihood and Gaussian estimation of continuous time models in finance. In: Handbook of financial time series, pp. 497–530. Springer (2009).
34. Draper, NR, Smith, H: Applied regression analysis. John Wiley & Sons (2014).
35. Kloeden, PE, Platen, E, Schurz, H: Numerical solution of SDE through computer experiments. Springer Science & Business Media (2012).
36. Arulampalam, MS, Maskell, S, Gordon, N, Clapp, T: A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. Signal Process. IEEE Trans. 50(2), 174–188 (2002).
37. Bosu, A, Corley, CS, Heaton, D, Chatterji, D, Carver, JC, Kraft, NA: Building reputation in StackOverflow: an empirical investigation. In: Proceedings of the 10th Working Conference on Mining Software Repositories, pp. 89–92. IEEE Press, San Francisco (2013).
38. Movshovitz-Attias, D, Movshovitz-Attias, Y, Steenkiste, P, Faloutsos, C: Analysis of the reputation system and user contributions on a question answering website: StackOverflow. In: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 886–893. IEEE, Niagara Falls (2013).
39. Phillips, PC: The structural estimation of a stochastic differential equation system. Econometrica J. Econ. Soc. 40(6), 1021–1041 (1972).
40. Ashby, FG: A biased random walk model for two choice reaction times. J. Math. Psychol. 27(3), 277–297 (1983).
41. Diederich, A: Dynamic stochastic models for decision making under time constraints. J. Math. Psychol. 41(3), 260–274 (1997).
42. Nosofsky, RM, Palmeri, TJ: An exemplar-based random walk model of speeded classification. Psychol. Rev. 104(2), 266 (1997).
43. Vasicek, O: An equilibrium characterization of the term structure. J. Financ. Econ. 5(2), 177–188 (1977).
44. Ditlevsen, S, Lansky, P: Estimation of the input parameters in the Ornstein-Uhlenbeck neuronal model. 011907. 71(1) (2005).
45. Beaulieu, JM, Jhwueng, DC, Boettiger, C, O’Meara, BC: Modeling stabilizing selection: expanding the Ornstein–Uhlenbeck model of adaptive evolution. Evolution. 66(8), 2369–2383 (2012).
46. Benth, FE, Kallsen, J, Meyer-Brandis, T: A Non-Gaussian Ornstein–Uhlenbeck process for electricity spot price modeling and derivatives pricing. Appl. Math. Finance. 14(2), 153–169 (2007).
47. Wu, Q, Miao, C: Curiosity: From psychology to computation. ACM Comput. Surv. (CSUR). 46(2), 18 (2013).