The second type of edge is the stock-related-stock edge. Whereas the tweet-influences-stock relationship captures the impact of public sentiment on a given day, the stock-related-stock relationship captures the impact of past historical data on the future. For each stock node, the stock nodes within a specific history window (set to 5 days after experimentation) are connected to it, with a self-loop added for the node itself; the closer a historical node is to the present node, the greater its impact, and the larger the edge weight.
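As an illustration, the sketch below builds these edges for a single stock's day-indexed nodes. The helper and the linear decay scheme are assumptions for illustration only; the excerpt states that weight grows with recency but does not give the exact weight function.

```python
import numpy as np

def build_stock_stock_edges(num_days, window=5):
    """Connect each stock-day node to its `window` most recent
    historical nodes, plus a self-loop. The linear decay below is
    an assumption; the paper only says closer history weighs more."""
    src, dst, weight = [], [], []
    for t in range(num_days):
        src.append(t); dst.append(t); weight.append(1.0)   # self-loop
        for k in range(1, window + 1):                     # k = 1 is yesterday
            if t - k < 0:
                break
            src.append(t - k)                              # historical node
            dst.append(t)                                  # present node
            weight.append((window + 1 - k) / (window + 1)) # decays with distance k
    return np.array([src, dst]), np.array(weight)

edge_index, edge_weight = build_stock_stock_edges(num_days=10)
```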
2.2.5 Feature Extension by BERT
The features of the original tweet (retained on the tweet node when the knowledge graph is established) are: body, predicted_label, comment_num, retweet_num, and like_num. The last four features are fully accounted for when the knowledge-graph weights are calculated, while body, i.e., the original text of the tweet, is not. Using BERT, each tweet body is vectorized into a 768-dimensional feature vector. After normalization, these features serve as the tweet-side input to the subsequent graph convolutional network.
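A minimal sketch of this step follows. The bert-base-uncased checkpoint, the mean pooling over tokens, and the L2 normalization are assumptions; the excerpt only specifies that BERT vectorizes the tweet text into 768 dimensions and that the features are normalized.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def embed_tweets(bodies):
    """Map raw tweet bodies to 768-dimensional BERT vectors."""
    batch = tokenizer(bodies, padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
    hidden = bert(**batch).last_hidden_state         # (B, T, 768)
    mask = batch["attention_mask"].unsqueeze(-1)     # ignore padding tokens
    vecs = (hidden * mask).sum(1) / mask.sum(1)      # mean pooling over tokens
    return torch.nn.functional.normalize(vecs, dim=1)

features = embed_tweets(["$AAPL to the moon", "bearish on tech"])
print(features.shape)  # torch.Size([2, 768])
```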
2.2.6 Graph Convolutional Network
The network used in this article is composed of several heterogeneous graph convolutional models.
First, the graph data are fed into the GAT layer. The GAT model used in this layer introduces a multi-head attention mechanism, updating the input tweet-influences-stock relationship through two attention heads.
through two attention heads. This layer will transfer.
N
(
c,r,l
)
= α∙c+β∙r+γ∙l
(
3
)
W
(
s,c,r,l
)
=
δ
∙ N
(
c,r,l
)
predicted
=B
earish
δ
∙ N
(
c,r,l
)
predicted
=B
ullish
δ∙N
(
c,r,l
)
+ε∙1+N
(
c,r,l
)
predicted
=N
eutral
(
4
)
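A direct transcription of Eqs. (3)-(4) in code is given below. The coefficient values are placeholders, since the fitted values of α through ε are not reported in this excerpt.

```python
def tweet_edge_weight(predicted, c, r, l,
                      alpha=0.4, beta=0.3, gamma=0.3,
                      delta=1.0, eps=0.1):
    """Edge weight W(s, c, r, l) from Eqs. (3)-(4).
    Coefficients are hypothetical placeholders."""
    n = alpha * c + beta * r + gamma * l        # Eq. (3)
    if predicted in ("Bearish", "Bullish"):     # Eq. (4), sentiment cases
        return delta * n
    return delta * n + eps * (1 + n)            # predicted == "Neutral"

w = tweet_edge_weight("Bullish", c=12, r=30, l=95)
```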
This layer transfers the features of each tweet node to the corresponding stock node along these weighted edges, combining the outputs of the two attention heads by mean aggregation. Its function is to pass the features of the otherwise unconnected same-day tweets to the stock node and to output the updated stock features. Considering that the feature dimension of the tweet nodes is large and the adjacency matrices of the stock nodes are relatively sparse, this paper experimented with both the GAT model and the SAGE model for this layer. The results show that training converges when the GAT model is used, whereas it fails to converge with the SAGE model. The likely reason is that SAGE reduces model complexity by sampling nodes; since the tweet-node features are an important dimension of the model, the whole graph cannot be propagated through a SAGE layer alone, resulting in poor performance.
The output of this layer contains only the updated stock-node features, which are then passed through an activation function and a dropout layer.
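A sketch of this first convolution using PyTorch Geometric's GATConv follows. The hidden sizes, the ELU activation, and the dropout rate are assumptions; the excerpt fixes only the 768-dimensional tweet input, the two attention heads, and the mean aggregation of the heads (realized here by concat=False).

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class TweetToStockLayer(torch.nn.Module):
    """First convolution: two-head GAT over tweet->stock edges."""
    def __init__(self, tweet_dim=768, stock_dim=64, out_dim=64, p=0.5):
        super().__init__()
        # bipartite (tweet, stock) input sizes; concat=False averages heads
        self.gat = GATConv((tweet_dim, stock_dim), out_dim,
                           heads=2, concat=False, add_self_loops=False)
        self.p = p

    def forward(self, x_tweet, x_stock, edge_index):
        # edge_index[0] indexes tweet nodes, edge_index[1] stock nodes
        h = self.gat((x_tweet, x_stock), edge_index)
        h = F.elu(h)                                           # activation
        return F.dropout(h, p=self.p, training=self.training)  # dropout
```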
The second convolutional layer is the SAGE layer. In its input, the features of each stock node already incorporate the tweet features updated by the first convolutional layer, as well as the stock-related-stock relations and weights constructed above. SAGE is chosen for this layer because the SAGE model aggregates messages only from a node's K-hop neighbors. When the heterogeneous knowledge graph was built, each stock node was associated with only its 5 historical nodes in order to avoid interference between stock nodes far apart in time, thus retaining the influence of a specific time window. The SAGE model in this layer likewise updates each stock node using only first-order neighbor information, so that, combined with the heterogeneous knowledge graph, messages are passed from only those 5 historical nodes.
The output is then activated. After activation, a BatchNorm layer normalizes the data, and the classification results are finally output using softmax.
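A sketch of this second stage using PyTorch Geometric's SAGEConv is given below. The hidden sizes and the number of output classes are assumptions, and note that PyG's SAGEConv does not consume edge weights, so the recency weights from the graph construction would require a custom aggregation; this sketch omits them and keeps only the layer order stated above: SAGE, activation, BatchNorm, softmax.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv

class StockHistoryLayer(torch.nn.Module):
    """Second convolution: 1-hop GraphSAGE over stock-related-stock
    edges (each node sees only its 5 historical neighbors)."""
    def __init__(self, in_dim=64, hidden=64, num_classes=2):
        super().__init__()
        self.sage = SAGEConv(in_dim, hidden)   # first-order aggregation only
        self.bn = torch.nn.BatchNorm1d(hidden)
        self.cls = torch.nn.Linear(hidden, num_classes)

    def forward(self, x_stock, edge_index):
        h = self.sage(x_stock, edge_index)     # aggregates the 5 history nodes
        h = self.bn(F.relu(h))                 # activation, then BatchNorm
        return F.softmax(self.cls(h), dim=1)   # class probabilities
```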
3 RESULTS AND DISCUSSION
3.1 Experimental Configuration
The training set used in the experiment consisted of 90% of the original data set, and the test set consisted of the remaining 10%. The experimental configurations are shown in Table 10.