


QF-TraderNet: Intraday Trading via Deep Reinforcement Learning With Quantum Price Levels

1 Introduction

Financial trading is an online decision-making process (Deng et al., 2016). Previous works (Moody and Saffell, 1998; Moody and Saffell, 2001; Dempster and Leemans, 2006) demonstrated the Reinforcement Learning (RL) agent's promising profitability in trading activities. However, traditional RL algorithms face challenges in the intraday trading task in three aspects: 1) Short-term financial movement is often accompanied by noisy oscillations. 2) The computational complexity of making decisions in the daily continuous-value price range. In the T + n strategy, RL agents are assigned a long, neutral, or short position in each trading day, including the Fuzzy Deep Recurrent Neural Networks (FDRNN) (Deng et al., 2016) and Direct Reinforcement Learning (DRL) (Moody and Saffell, 2001). However, in day trading, i.e., the T + 0 strategy, the trading problem is converted to identifying the optimal price at which to open and close the order. 3) The early stop of orders when applying the intraday strategy. Conventionally, the settlement of orders involves two hyperparameters: Take Profit (TP) and Stop Loss (SL). TP refers to the price at which to close the current order and secure the profit if the price moves as expected. SL denotes the price at which to terminate the transaction and avoid a further loss if the price moves in a losing direction (e.g., the price drops after a long position decision). These two hyperparameters are defined as a fixed shift, relative to the price at which the market was entered, known as points. If the price touches these two preset levels, the order is settled deterministically. An instance of the early-stop order is shown in Figure 1.


FIGURE 1. An early-stop loss problem: a short order is settled early (red dashed line: SL) before the price drops to the profitable range. Thus, the strategy loses the potential profit (blue double arrow).

Focusing on the aforementioned challenges, we propose a deep reinforcement learning-based end-to-end learning model, named QF-TraderNet. Our model directly generates the trading policy to control profit and loss instead of using fixed TP and SL. QF-TraderNet comprises two neural networks with distinct functions: 1) a Long Short-Term Memory (LSTM) network for extracting the temporal features in financial time series; 2) a policy generator network (PGN) for generating the distribution of actions (the policy) in each state. We especially introduce the Quantum Price Levels (QPLs), as illustrated in Figure 2, to design the action space for the RL agent, thus discretizing the price-value space. Our method is inspired by Quantum Finance Theory, in which QPLs capture the equilibrium states of price movement on a daily basis (Lee, 2020). We utilize the deep reinforcement learning algorithm to update the trainable parameters of QF-TraderNet iteratively to maximize the cumulative price return.


FIGURE 2. Illustration of AUDUSD's QPLs in 3 successive trading days (23/04/2020–27/04/2020) in a 30-min K-line chart. The blue lines represent negative QPLs based on the ground state (black dashed line); the red lines are positive QPLs. Line color deepens as the QPL level n rises.

Experiments on various financial datasets, including financial indices, metals, crude oil, and FOREX, and comparisons with previous RL and DL-based single-product trading systems have been conducted. Our QF-TraderNet outperforms state-of-the-art baselines in profitability, evaluated by the cumulative return and the risk-adjusted return (Sharpe ratio), and in robustness in the face of market turbulence. Our model shows adaptability in unseen market environments. The generated policy of QF-TraderNet also provides an interpretable profit-and-loss order control strategy.

Our main contributions can be summarized as follows:

• We propose a novel end-to-end day-trading model that directly learns the optimal price level at which to settle, thus solving the early stop in an implicit stop-loss and take-profit setting.

• We are the first to represent the RL agent's action space via the daily quantum price levels, making machine day trading tractable.

• Under the same market information perception, we achieve better profitability and robustness than previous state-of-the-art RL-based models.

2 Related Work

Our work is in line with two sub-tasks: financial feature extraction and trading based on deep reinforcement learning. We briefly review past studies.

2.1 Financial Feature Extraction and Representation

Computational approaches for applications in financial modeling have attracted much attention in the past. Peralta and Zareei (2016) utilized the network model to perform portfolio planning and selection. Giudici et al. (2021) used volatility spillover decomposition methods to model the interactions between two currencies. Resta et al. (2020) conducted a technical analysis-based approach to identify trading opportunities, with a focus on cryptocurrency. Among these, neural networks show promising power in learning both structured and unstructured information. Most of the related work in neural financial modeling addresses relationship embedding (Li et al., 2019) and prediction (Wei et al., 2017), option pricing (Pagnottoni, 2019), and forecasting (Neely et al., 2014). Long Short-Term Memory networks (LSTM) (Wei et al., 2017) and Elman recurrent neural networks (Wang et al., 2016) were successfully employed in financial time series analysis tasks. Tran et al. (2018) utilized the attention mechanism to refine RNNs. Mohan et al. (2019) leveraged both market and textual information to boost the performance of stock prediction. Some studies also adopted stock embedding to mine relationship indicators (Chen et al., 2019).

2.2 Reinforcement Learning in Trading

Algorithmic trading has been widely studied in its different subareas, including risk control (Pichler et al., 2021), portfolio optimization (Giudici et al., 2020), and trading strategy (Marques and Gomes, 2010; Vella and Ng, 2015; Chen et al., 2021). Nowadays, AI-based trading, especially the reinforcement learning approach, attracts interest in both academia and industry. Moody and Saffell (2001) proposed a direct reinforcement algorithm for trading and performed a comprehensive comparison between Q-learning and the policy gradient. Huang et al. (2016) further proposed a robust trading agent based on deep Q-networks (DQN). Deng et al. (2016) utilized fuzzy logic with a deep learning model to extract financial features from noisy time series, which achieved state-of-the-art performance in single-product trading. Xiong et al. (2018) employed the Deep Deterministic Policy Gradient (DDPG), based on the standard actor-critic framework, to perform stock trading. The experiments demonstrated their profitability over baselines including the min-variance portfolio allocation method and the technical approach based on the Dow Jones Industrial Average (DJIA) index. Wang et al. (2019) employed the RL algorithm to construct winner and loser portfolios and traded with the buy-winner-sell-loser strategy. However, the intraday trading task for RL trading agents is still less addressed, mainly because of the complexity in designing trading positions for frequent trading strategies. We primarily aim at efficient intraday trading in our research.

3 QF-TraderNet

Day trading refers to the strategy of taking a position and leaving the market within one trading day. We let our model send an order when the market opens every trading day. Based on the observed environment, we train QF-TraderNet to learn the best QPL at which to settle. We introduce the QPL-based action space search and the model architecture separately below.

3.1 Quantum Finance Theory Based Action Space Search

Quantum finance theory (QFT) elaborates on the relationship between the financial market and the classical-quantum mechanics model (Lee, 2020; Meng et al., 2015; Ye and Huang, 2008). QFT proposes an anharmonic oscillator model to embed the interrelationships among financial products. It considers that the dynamics of a financial product are driven by the energy field generated by itself and other financial products (Lee, 2020). The energy levels generated from the field of the particle determine the equilibrium states of price movement on a daily basis, which are noted as the daily quantum price levels (QPLs). QPLs could indeed be viewed as the support or resistance in classical financial analysis. Past studies (Lee, 2019) have shown that QPLs can be used as extracted features for financial time series. The procedure of the QPL calculation is given in the following steps.

Step 1: Modeling the Potential Energy of Market Movement via Four Major Market Participants

As in classical quantum mechanics, the Hamiltonian in QFT contains a potential term and a volatility (kinetic) term. Based on conventional financial analysis, the primary market participants include 1) Investors, 2) Speculators, 3) Arbitrageurs, 4) Hedgers, and 5) Market makers; however, there is no available opportunity for arbitrageurs to execute effective trading according to the efficient market hypothesis (Lee, 2020). Thus we ignore the arbitrageurs' effect and count the impact of the other participants towards the calculation of the market potential term:

Market makers provide facilitator services for the different participants, absorbing the outstanding demand, noted as $z_\sigma$, with absorbability factors $\alpha_\sigma$. So, the excess demand at any instance is given by $\Delta z = z_+ - z_-$. The relationship between the instantaneous return $r(t) = r(t, \Delta t) = \frac{p(t) - p(t-\Delta t)}{p(t-\Delta t)}$ and the excess demand could be approximately noted as $r(t) = \frac{\Delta z}{\gamma}$, in which $\gamma$ represents the market depth. For an efficient market with a smooth market environment, we assume the absorbability of present orders with different trading directions to be the same, and the contribution of the market makers is derivable as (Lee, 2020),

$$\left.\frac{d\Delta z}{dt}\right|_{MM} = \left.\frac{dz_+}{dt}\right|_{MM} - \left.\frac{dz_-}{dt}\right|_{MM} \quad (1)$$

where $\sigma$ denotes the trading direction, including +: long position and -: short position, and $r_t$ denotes the instantaneous price return with respect to time t.

Speculators are trend-following participants with little sense of risk control. Their behavior mainly contributes to the market movement through its harmonic oscillator term. A damping variable $\delta$ is defined to represent the resistance of trend-followers' behaviors towards the market. Considering that speculators care little about risk, there is no high-order anharmonic term regarding the market volatility,

$$\left.\frac{d\Delta z}{dt}\right|_{SP} = -\delta|_{SP}\, r_t \quad (3)$$

Investors have a sense of stopping loss. They are 1) earning profit following the trend and 2) minimizing the risk; thus, we define their potential energy by,

$$\left.\frac{d\Delta z}{dt}\right|_{IV} = r_t\left(\delta|_{IV} - \nu|_{IV}\, r_t^2\right) \quad (4)$$

where $\delta$ and $\nu$ stand for the harmonic dynamic term (trend-following contribution) and the anharmonic term (market volatility), respectively.

Hedgers also control the risk, but using refined hedging techniques. Usually, hedgers perform the reverse trading direction compared with common investors, especially in the one-product hedging strategy. Hence, the market dynamics caused by hedgers could be summarized as,

$$\left.\frac{d\Delta z}{dt}\right|_{HG} = -\left(\delta|_{HG} - \nu|_{HG}\, r_t^2\right) r_t \quad (5)$$

Concluding Eqs. 1 to 5, the instantaneous price return dr/dt could be rewritten as,

$$\frac{dr}{dt} = \gamma\sum_{i=1}^{P}\frac{d\Delta z_i}{dt} = -\gamma\delta\, r_t + \gamma\nu\, r_t^3 \quad (6)$$

where P denotes the number of types of participants in the market. $\delta$ and $\nu$ in Eq. 6 summarize the corresponding terms across all participant models, i.e., $\delta = \gamma\alpha_{MM} + \delta_{SP} + \delta_{HG} - \delta_{IV}$ and $\nu = \nu_{HG} - \nu_{IV}$. Combining dr/dt with the Brownian price returns represented by the Langevin equation, the instantaneous potential energy is modeled with the following equation,

$$V(r) = \int\left(\gamma\eta\delta\, r - \gamma\eta\nu\, r^3\right)dr = \frac{\gamma\eta\delta}{2}r^2 - \frac{\gamma\eta\nu}{4}r^4 \quad (7)$$

where $\eta$ is the damping force factor of the market.

Step 2: Modeling the Kinetic Term of Market Movement via Price Return

One challenge in modeling the kinetic term is to replace the displacement of classical particles with an appropriate measurement in finance. Specifically, we replace displacement with the price return r(t), as r(t) connects the price change with the unit of time, which simplifies the Schrödinger equation into the non-time-dependent one. Hence, the Hamiltonian for the financial particle could be formulated by,

$$H = -\frac{\hbar^2}{2m}\frac{d^2}{dr^2} + V(r) \quad (8)$$

where $\hbar$ and m denote the Planck constant and the intrinsic properties of the financial market, such as the market capitalization in a stock market. Combining the Hamiltonian with the classical Schrödinger equation, the Schrödinger Equation for Quantum Finance Theory (QFSE) comes out as (Lee, 2020),

$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dr^2} + \frac{\gamma\eta\delta}{2}r^2 - \frac{\gamma\eta\nu}{4}r^4\right]\phi(r) = E\,\phi(r) \quad (9)$$

E denotes the particle's energy levels, which refer to the Quantum Price Levels of the financial particle. The first term $-\frac{\hbar^2}{2m}\frac{d^2}{dr^2}$ is the kinetic energy term. The second term V(r), i.e., Eq. 7, represents the potential energy term of the quantum finance market. $\phi(r)$ is the wave function of the QFSE, which is approximated by the probability density function of historical price returns.

Step 3: Action Space Search by Solving the QFSE

According to QFT, if there were no extrinsic incentives such as financial events or the release of evaluative financial figures, quantum financial particles (QFPs) would remain at their energy levels (i.e., equilibrium states) and perform regular oscillations. If there is an external stimulus, QFPs would absorb or release a quantum of energy and jump to other QPLs. Thus, daily QPLs could be viewed as the potential states of the price movements within one trading day. Hence, we employ the QPLs as the action candidates in the action space $A = \{a_1, a_2, \ldots, a_{|A|}\}$ of QF-TraderNet. The detailed numerical method for solving the QFSE and the algorithm for the QPL-based action space search are given in the supplementary file.
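As a rough, self-contained illustration of the idea (our own sketch, not the paper's supplementary scheme), the QFSE of Eq. 9 can be discretized on a grid with a finite-difference Laplacian, and the lowest eigenvalues read off as candidate energy levels. All constants (hbar, m, gamma, eta, delta, nu) are placeholders that would be estimated from market data in practice.

```python
import numpy as np

def qfse_energy_levels(r_max=0.1, n_grid=1201, n_levels=8,
                       hbar=1.0, m=1.0, gamma=1.0, eta=1.0, delta=1.0, nu=0.5):
    """Finite-difference eigensolver for Eq. 9:
    [-(hbar^2/2m) d^2/dr^2 + (gamma*eta*delta/2) r^2
     - (gamma*eta*nu/4) r^4] phi(r) = E phi(r)."""
    r = np.linspace(-r_max, r_max, n_grid)
    dr = r[1] - r[0]
    V = 0.5 * gamma * eta * delta * r**2 - 0.25 * gamma * eta * nu * r**4  # Eq. 7
    k = hbar**2 / (2.0 * m * dr**2)
    # Tridiagonal Hamiltonian: central-difference kinetic term plus diagonal potential.
    H = np.diag(2.0 * k + V) - k * np.eye(n_grid, k=1) - k * np.eye(n_grid, k=-1)
    return np.linalg.eigvalsh(H)[:n_levels]   # lowest energy levels E

# The returned E values play the role of QPLs: offsets around the daily ground
# state (e.g., the previous close) that define the agent's action candidates.
```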

3.2 Deep Feature Learning and Representation by LSTM Networks

LSTM networks show promising performance in sequential feature learning, owing to their structural adaptability (Gers et al., 2000). We introduce LSTM networks to extract the temporal features of the financial series, thus improving the policy generator network's (PGN) perception of the market status.

We use the same look-back window as in Wang et al. (2019) with size W to split the input sequence x from the complete series $S = \{s_1, s_2, \ldots, s_t, \ldots, s_T\}$, i.e., the agent evaluates the market status over the time period of size W. Hence, the input matrix of the LSTM could be noted as $X = \{x_1, x_2, \ldots, x_t, \ldots, x_{T-W+1}\}$, where $x_t = [s_{t-W+w} \mid w \in [1, W]]^T$. Each input vector $s_t$ is constituted by: 1) the opening, highest, lowest, and closing prices of each trading day (note: the closing price on day t − 1 might differ from the opening price on day t because of market adjustments outside trading hours; thus we treat the prices as four distinct variables); 2) the transaction volume; 3) the Moving Average Convergence-Divergence (MACD), a technical indicator to identify the market status; 4) the Relative Strength Index (RSI), a technical indicator measuring the price momentum; 5) the Bollinger Bands (main, upper, and lower), which can be applied to identify the potential price range and consequently observe the market trend (Colby and Meyers, 1988); and 6) the KDJ (stochastic oscillator), used in short-term directional trading via price velocity techniques (Colby and Meyers, 1988). A windowing sketch consistent with this construction is given below.
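This helper is our own illustration; the feature columns are assumed to be precomputed as listed above.

```python
import numpy as np

def build_input_windows(S, W=3):
    """Stack W consecutive daily feature vectors into one LSTM input x_t.

    S: array of shape (T, F), one row s_t per trading day holding the F
       features above (OHLC, volume, MACD, RSI, Bollinger bands, KDJ).
    Returns X of shape (T - W + 1, W, F).
    """
    return np.stack([S[t:t + W] for t in range(len(S) - W + 1)])
```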

Principal component analysis (PCA) (Wold et al., 1987) is utilized to compress the series data S into $\tilde{F}$ dimensions and to denoise it. Afterwards, L2 normalization is applied to rescale the input features to the same magnitude. The preprocessing is calculated as,

$$\tilde{X} = \frac{PCA_{F\rightarrow\tilde{F}}(X)}{\left\|PCA_{F\rightarrow\tilde{F}}(X)\right\|_2} \quad (10)$$

where $\tilde{F} < F$, and the deep feature learning model could be represented as,

$$h_t = LSTM_{\xi}\left(\tilde{x}_t\right), \quad t \in [0,\, T-W+1] \quad (11)$$

where $\xi$ denotes the trainable parameters of the LSTM.
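A minimal PyTorch reading of Eqs. 10–11 follows; it is a sketch under our assumptions (scikit-learn's PCA fitted on the flattened timesteps, per-vector L2 scaling, and the LSTM's last hidden state taken as $h_t$), not the authors' released code.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

def preprocess(X, f_tilde=4):
    """Eq. 10: compress the F raw features to F~ via PCA, then L2-normalize.
    X: (N, W, F) windowed inputs; PCA is fit on the flattened timesteps."""
    n, w, f = X.shape
    Z = PCA(n_components=f_tilde).fit_transform(X.reshape(n * w, f))
    Z /= np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12  # per-vector L2 scaling
    return torch.tensor(Z.reshape(n, w, f_tilde), dtype=torch.float32)

class FeatureLSTM(nn.Module):
    """Eq. 11: h_t = LSTM_xi(x_t~); the last hidden state is the market state."""
    def __init__(self, f_tilde=4, hidden=128, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(f_tilde, hidden, num_layers=layers, batch_first=True)

    def forward(self, x):            # x: (N, W, F~)
        out, _ = self.lstm(x)
        return out[:, -1, :]         # h_t: (N, hidden)
```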

3.3 Policy Generator Network (PGN)

Given the learned feature vector $h_t$, the PGN directly produces the output policy, i.e., the probability of settling the order at each +QPL and -QPL, according to the action scores $z_t^{(i)}$ produced by a fully-connected network (FFBPN),

$$z_t = W_{\theta}\, h_t + b_{\theta} \quad (12)$$

where $\theta$ denotes the parameters of the FFBPN, with the weight matrix $W_{\theta}$ and bias $b_{\theta}$. Let $a_t^{(i)}$ denote the i-th action at time t. The output policy $\vec{a}_t$ is calculated as,

$$a_t^{(i)} = \frac{\exp\left(z_t^{(i)}\right)}{\sum_{i'=1}^{|A|}\exp\left(z_t^{(i')}\right)} \quad (13)$$

At timestep t, the model takes action $a_t$ by sampling from the policy $\vec{a}_t$, which comprises long (+) and short (-) trading directions. $\vec{a}_t$ contains |A| dimensions, indicating the number of candidate actions, each with a reward of price return $r_t^{(i)}$,

$$r_t^{(i)} = \begin{cases}\delta\left(QPL_{\delta}^{i} - p_t^{o}\right), & QPL_{\delta}^{i}\in[p_t^{l},\,p_t^{h}]\\ \delta\left(p_t^{c} - p_t^{o}\right), & QPL_{\delta}^{i}\notin[p_t^{l},\,p_t^{h}]\end{cases} \quad (14)$$

where $p_t^{o}$, $p_t^{h}$, $p_t^{l}$, and $p_t^{c}$ denote the day's opening, highest, lowest, and closing prices, and $\delta$ denotes the trading direction: for actions with a +QPL as the target price level to settle, the trade will be determined as a long buy ($\delta$ = +1); for the actions with a -QPL, a short sell ($\delta$ = −1) trade will be performed; and $\delta$ is 0 when the decision is made to be neutral, as no trade will be made on trading day t.
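Read literally, Eq. 14 pays the distance from the day's opening price to the chosen QPL when the day's high-low range reaches that level, and otherwise forces settlement at the close. A sketch of that reading (our construction; the argument names are hypothetical):

```python
def order_reward(delta, target_qpl, p_open, p_high, p_low, p_close):
    """Eq. 14 as we read it: a trade opened at p_open settles at target_qpl
    if the day's [low, high] range reaches it; otherwise it is closed at
    p_close. delta is +1 (long), -1 (short), or 0 (neutral)."""
    if delta == 0:
        return 0.0                                # no order placed this day
    if p_low <= target_qpl <= p_high:
        return delta * (target_qpl - p_open)      # settled at the chosen QPL
    return delta * (p_close - p_open)             # forced settlement at close
```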

We train our QF-TraderNet with reinforcement learning. The key idea is to maintain a loop with the following steps: 1) agent π observes the environment, 2) π takes an action, and 3) π adjusts its behavior to receive more reward, until the agent has achieved its learning goal (Sutton and Barto, 2018). Thus, for each training episode, a trajectory $\tau = \{(h_1, a_1), (h_2, a_2), \ldots, (h_{T-W+1}, a_{T-W+1})\}$ could be defined as the sequence of state-action tuples, with the corresponding payoff sequence¹ $r = \{r_1, r_2, r_3, \ldots, r_{T-W+1}\}$. The probability of action $Pr(action_t = i)$ for each QPL is determined by QF-TraderNet as:

$$a_t^{(i)} = Pr\left(action_t = QPL_i \mid \tilde{X};\,\theta,\xi\right) \quad (15)$$
$$= \pi\left(PGN_{\theta}\left(LSTM_{\xi}(\tilde{x}_t)\right)\right)_{action=i} \quad (16)$$

Let $R(\tau)$ denote the cumulative price return for trajectory $\tau$, with $\sum_{t=1}^{T-W+1} r_t^{(i)} = R(\tau)$. Then, over all possible explored trajectories, the expected reward obtained by the RL agent could be evaluated as (Sutton et al., 2000),

$$J(\theta, \xi) = \sum_{\tau} R(\tau)\, Pr(\tau \mid \theta, \xi)_{\pi} \quad (17)$$

where $Pr(\tau \mid \theta, \xi)_{\pi}$ is the probability for the QF-TraderNet agent π with parameters θ and ξ to generate trajectory τ, estimated with Monte-Carlo simulation. Then, the objective is to maximize the expected reward, θ*, ξ* = argmax_{θ,ξ} J(θ, ξ). We substitute the objective with its opposite and apply gradient descent to optimize. To avoid the local optimum problem caused by multiple positive-reward actions, we use the state-dependent baseline method (Sutton and Barto, 2018) to let the RL agent perform a more efficient optimization. The detailed gradient calculation is presented in the supplementary.
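For orientation, a generic REINFORCE-style episode consistent with maximizing $J(\theta,\xi)$ with a baseline is sketched below. This is our own simplification (a two-layer policy head and a scalar baseline), not the authors' exact optimizer or the gradient derived in their supplementary.

```python
import torch
import torch.nn as nn

class PolicyGeneratorNet(nn.Module):
    """Eqs. 12-13 in simplified form: action scores from a small
    feed-forward net, turned into a softmax policy."""
    def __init__(self, hidden=128, n_actions=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(hidden, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, h):
        return torch.softmax(self.net(h), dim=-1)

def reinforce_episode(policy_net, optimizer, states, reward_fn, baseline=0.0):
    """One episode: sample a trajectory from the policy, score it by the
    cumulative price return R(tau), and ascend the policy gradient."""
    probs = policy_net(states)                    # (T', |A|) daily policies
    dist = torch.distributions.Categorical(probs)
    actions = dist.sample()                       # one trajectory tau
    rewards = reward_fn(actions)                  # tensor of per-day r_t (e.g., Eq. 14)
    R = rewards.sum()                             # R(tau)
    loss = -(dist.log_prob(actions) * (R - baseline)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return R.item()
```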

3.4 Trading Policy With Learnable Soft Profit and Loss Control

In QF-TraderNet, the LSTM networks learn the hidden representation and pass it into the PGN; then the PGN generates the learned policy to decide the target QPL at which to settle. As the action is sampled from the generated policy, QF-TraderNet adopts a soft profit-and-loss control strategy rather than the deterministic TP and SL. The overall architecture of QF-TraderNet is shown in Figure 3.


FIGURE 3. The RL framework for the QF-TraderNet.

An equivalent way to interpret our scheme is that our model trades with a long buy if the decision is made at a positive QPL. Conversely, a short sell transaction will be delivered. Once the trading direction is decided, the target QPL with the maximum probability will be considered the soft take-profit price (S-TP), and the soft stop-loss line (S-SL) will be the QPL with the highest probability in the opposite trading direction. An illustration is presented in Figure 4.


FIGURE 4. A case study illustrating our profit-and-loss control strategy. The trading policy is uniformly distributed at the start. Ideally, our model assigns the largest probability to the +3 QPL action, which earns the maximum profit, as the S-TP. On the short side, −1 QPL carries the most considerable reward, leading it to be assigned the maximum probability as the S-SL.

Since the S-TP and S-SL control is probability-based, when the price touches the stop-loss line prematurely, QF-TraderNet will not be forced to perform the settlement. It will consider whether there is a better target price for settlement within the entire action space. Thus, the model is more flexible in the SL and TP control across different states, compared with using a few preset "hard" hyperparameters.
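One way to make the S-TP/S-SL reading concrete: assuming the policy vector is ordered from the most negative to the most positive QPL with the neutral action in the middle (an ordering convention we adopt for illustration), the pair can be read off per day as follows:

```python
import numpy as np

def soft_tp_sl(policy, neg_qpls, pos_qpls):
    """Derive the soft take-profit (S-TP) and soft stop-loss (S-SL) from one
    day's policy vector, assumed ordered as
    [-k QPL ... -1 QPL, neutral, +1 QPL ... +k QPL]."""
    k = len(neg_qpls)
    p_neg, p_pos = policy[:k], policy[k + 1:]
    if policy.argmax() == k:
        return 0, None, None                      # neutral: no order today
    if p_pos.max() >= p_neg.max():                # long side carries more mass
        return +1, pos_qpls[p_pos.argmax()], neg_qpls[p_neg.argmax()]
    return -1, neg_qpls[p_neg.argmax()], pos_qpls[p_pos.argmax()]
```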

4 Experiments

We conduct the experimental evaluation of our QF-TraderNet on various types of financial datasets. In our experiments, eight datasets from four categories are used, including 1) foreign exchange products: Great Britain Pound vs. United States Dollar (GBPUSD), Australian Dollar vs. United States Dollar (AUDUSD), Euro vs. United States Dollar (EURUSD), United States Dollar vs. Swiss Franc (USDCHF); 2) financial indices: S&P 500 Index (S&P500), Hang Seng Index (HSI); 3) Metal: Silver vs. United States Dollar (XAGUSD); and 4) Crude oil: Oil vs. US Dollar (OILUSe). The evaluation is conducted from the perspective of earning profits and of robustness when the agents face unexpected changes of market states. We also investigate the impact of various settings of our proposed QPL-based action space search for the RL trader, and present the ablation study of our model.

4.1 Experiment Settings

All datasets utilized in the experiments are fetched from the free and open historical data center in MetaTrader 4, a professional trading platform for FOREX, financial indices, and other securities. We download the raw time series data, around 2048 trading days, and divide the front 90% of the data for training and validation. The rest is utilized as out-of-sample verification, i.e., the continuous series from November 2012 to July 2020 has been spliced to construct the sequential training samples; the remaining part is applied for testing and validation. Notably, the evaluation period covers the recent fluctuations in the global financial market caused by the COVID-19 pandemic, which could be utilized as a robustness test of the trading agent handling unforeseen market fluctuations. The size of the look-back window is set to 3, and the metrics regarding price return and Sharpe ratio are calculated daily. In the backtest, the initial capital is set to the related currency or asset with a value of 10,000, at a transaction cost of 0.3% (Deng et al., 2016). All the experiments are conducted on a single NVIDIA GTX Titan X GPU.

4.2 Model Settings

To compare our model with traditional methods, we select forecasting-based trading models and other state-of-the-art reinforcement learning-based trading agents as baselines.

Market baseline (Huang et al., 2016). This strategy is used to measure the overall performance of the market during the period T, by holding the product consistently.

DDR-RNN. Following the idea of Deep Direct Reinforcement, but we apply principal component analysis (PCA) to denoise and compose the data. We also employ an RNN to learn the features, and a two-layer FFBPN as the policy generator rather than the logistic regression in the original design. This model can be regarded as an ablation of QF-TraderNet without the QPL action space search.

FCM, a forecasting model based on an RNN trend predictor, consisting of a 7-layer LSTM with 512 hidden dimensions. It trades with a Buy-Winner-Sell-Loser strategy.

RF. Same design as FCM but predicting the trend via Random Forest.

QF-PGN. QF-PGN is the policy gradient based RL agent with QPL-based order control. A single FFBPN is utilized as the policy generator, with 3 ReLU layers and 128 neurons per layer. This model could be regarded as our model without the deep feature representation block.

FDRNN (Deng et al., 2016). A state-of-the-art direct reinforcement RL trader following one-product trading, using fuzzy representation and a deep autoencoder to extract the features.

We implement two versions of QF-TraderNet: 1) QF-TraderNet Lite (QFTN-L): a 2-layer LSTM with a 128-dimensional hidden vector as the feature representation, and a 3-layer policy generator network with 128, 64, and 32 neurons per layer. The size of the action space is 3. 2) QF-TraderNet Ultra (QFTN-U): same architecture as the Lite, but the number of candidate actions is enlarged to 7.

Regarding the training settings, the Adaptive Moment Estimation (Adam) optimizer with 1,500 training epochs is used for all iterative optimization models, at a 0.001 learning rate. For the algorithms requiring PCA, the target dimension $\tilde{F}$ is set to 4, ensuring the composed matrix embeds 99.5% of the interrelationship of the features. In the practical implementation, we directly apply the four prices as the input for USDCHF, S&P500, XAGUSD, and OILUSe; the normalization step is not performed for HSI and OILUSe. The reason is that our results show our model can perceive the market status well in these settings. For the sake of computational complexity, we remove the extra input features.

4.3 Performance in 8 Financial Datasets

As displayed in Figure 5 and Table 1, we present the evaluation of each trading system's profitability on the 8 datasets, with the metrics of cumulative price return (CPR) and the Sharpe ratio (SR). The CPR is computed as,

$$CPR = \sum_{t=1}^{T}\left(p_t^{holding} - p_t^{settlement}\right) \quad (18)$$

and the Sharpe ratio is calculated by:

$$SR = \frac{Average(CPR)}{StandardDeviation(CPR)} \quad (19)$$
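A direct numpy rendering of the two metrics (a sketch; daily_profits stands for the per-day settlement profits summed in Eq. 18):

```python
import numpy as np

def cumulative_price_return(daily_profits):
    """Eq. 18: CPR as the running sum of the per-day settlement profits."""
    return np.cumsum(np.asarray(daily_profits, dtype=float))

def sharpe_ratio(daily_profits):
    """Eq. 19: average daily return over its standard deviation."""
    p = np.asarray(daily_profits, dtype=float)
    return p.mean() / (p.std() + 1e-12)
```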


FIGURE 5. 1st panel: continuous partition of the training and verification data; 2nd panel: affected by the global economic situation, most datasets showed a descending trend in the testing interval, accompanied by extremely irregular oscillations; 3rd panel: cumulative reward curves for the different methods in the testing evaluation.


TABLE 1. Summary of the main comparison results among all models.

The result of Market denotes that the market is in a downtrend with high volatility during the evaluation interval, attributable to the recent global economic fluctuation. The price range in testing is not fully covered by the training data in some datasets (crude oil and AUDUSD), which tests the models in an unseen environment. Under these testing conditions, our QFTN-U trained with CPR achieves higher CPR and SR than the other comparisons, except for the SR in S&P500 and Crude Oil. QFTN-L is also comparable to the baselines. This signifies the profitability and robustness of our QF-TraderNet.

Furthermore, QFTN-L, QFTN-U, and the QF-PGN model yield significantly higher CPR and SR than the former RL traders without QPL-based actions (DDR-RNN and FDRNN). The ablation study in Table 2 also presents the contribution of each component in detail (Supervised denotes the average of RF and FCM), where the QPL actions dramatically contribute to the Sharpe ratio of our full model. This demonstrates the benefit of trading with QPLs to gain substantial profitability and an expeditious risk-control ability.


TABLE 2. Ablation study for QF-TraderNet.

The backtesting results in Table 3 show the good generalization of QFTN-U. It is the only strategy earning a sure profit on almost every dataset, which is because the day-trading strategy is less affected by the market trend, compared with other strategies in the long, neutral, and short setting. We also find that the performance of our model on the FOREX datasets is significantly better than on the others. FOREX contains more noise and fluctuations, which indicates the advantage of our models in highly fluctuating products.


TABLE 3. Summary of profits in the backtesting.

4.4 QPL-Inspired Intraday Trading Model Analysis

We analyze the decisions of the QPL-based intraday models in Table 4 as two classification tasks: 1) predicting the optimal QPL at which to settle; 2) predicting a profitable QPL (a QPL having the same trading direction as the optimal one) at which to settle. Noticeably, the action space for QF-PGN and QFTN-L is {+1 QPL, Neutral, -1 QPL}, which means that these two classification tasks are actually the same for them. QFTN-U might have multiple ground truths, as the payoff might be the same while settling at varied QPLs; thus we only report the accuracy. Table 4 indicates two points: 1) compared with QF-PGN, our QFTN-L with the LSTM as feature extractor has higher accuracy in the optimal QPL selection. The contribution of the LSTM to our model can also be proved in the ablation study in Table 2. 2) QFTN-U has lower accuracy in optimal QPL prediction compared with QFTN-L, attributable to the larger action space bringing difficulties in decision making. Nevertheless, QFTN-U earns higher CPR and SR. We visualize the reward in the training process and the actions made in testing, as shown in Figure 6. We find that the better performance of QFTN-U is due to the more accurate assessment of the trading direction (see their accuracy in the trading direction classification). In addition, QFTN-U can search its policy in a broader range. When the agent perceives changes in the market environment confidently, it can select a QPL farther from the ground state as the target price for order closing, rather than only the first positive or negative QPL, thereby obtaining more potential payoff, although the action might not be optimal. For illustration, if the price is increasing substantially, the agent acquires higher rewards by closing orders at the +3 QPL rather than at the only positive QPL available in QFTN-L's decisions. According to Figure 6, the trading directions made by the two QFTNs are usually the same, but QFTN-U tends to enlarge the level of the selected QPL to obtain more profit. However, the Ultra model usually needs more training episodes to converge (GBPUSD, EURUSD, and OILUSe, etc.). Additionally, the Lite model suffers from the local optimum trap on some datasets (AUDUSD and HSI), in which our model tends to select the same action consistently; e.g., the Lite model keeps delivering a short trade with a uniform TP setting at the -1 QPL for AUDUSD.


TABLE 4. Decision classification metrics.


FIGURE 6. Training curves for various settings of the action space size.

4.5 Enlarging the Size of the Action Space

In this section, we compare the average CPR and SR over the 8 datasets versus different settings of the action space size in Figure 7. We observe that when the size of the action space is less than 7, increasing this parameter has a positive effect on system performance. In particular, Figure 5 shows that our Lite model fails on the HSI dataset while the Ultra one achieves strong performance. We argue this is because a larger action space can possibly lead to trading with more complex strategies. However, when the number of candidate actions continues to increase, SR and CPR decrease after |A| = 7. Our analysis is that the action space of the day-trading model should ideally cover the optimal settlement QPL (the global ground truth) within the daily price range. Therefore, if the QPL that brings the maximum payoff is not in the model's action space, enlarging the action space makes it more feasible to capture the global ground truth. However, if the action space has covered the ground truth already, it is meaningless to continue to expand the action space. On the contrary, a large number of candidate actions can make the decision more difficult. We report the results for each dataset in the supplementary.


FIGURE 7. Effects of different settings of the action space size.

5 Conclusion and Future Work

In this paper, we investigated the application of Quantum Finance Theory in building an end-to-end day-trading RL trader. With a QPL-inspired probabilistic loss-and-profit control for the order settlement, our model demonstrates profitability and robustness in the intraday trading task. Experiments reveal that our QF-TraderNet outperforms the other baselines. To perform intraday trading, we assumed in this work that the ground state on the t-th day is available for QF-TraderNet. An interesting future work will be combining QF-TraderNet with state-of-the-art forecasters to execute real-time trading in a forecaster-trader framework, in which a forecaster predicts the opening price on the t-th day for our QF-TraderNet to perform trading.

Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Author Contributions

YQ: Conceptualization, Methodology, Implementation and Experiment, Validation, Formal analysis, Writing and Editing. YQ: Implementation and Experiment, Editing. YY: Visualization, Implementation and Experiment. ZC: Implementation and Experiment. RL: Supervision, Reviewing and Editing.

Funding

This work was supported by Research Grant R202008 of Beijing Normal University-Hong Kong Baptist University United International College (UIC) and the Key Laboratory for Artificial Intelligence and Multi-Modal Data Processing of the Department of Education of Guangdong Province.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

The authors highly appreciate the provision of computing equipment and facilities from the Division of Science and Technology of Beijing Normal University-Hong Kong Baptist University United International College (UIC). The authors also wish to thank the Quantum Finance Forecast Center of UIC for the R&D support and the provision of the platform qffc.org for system testing and evaluation.

Footnotes

1 r here denotes the reward of the RL agent, rather than the price return r(t) used previously in the QPL evaluation.

References

Chen, C., Zhao, L., Bian, J., Xing, C., and Liu, T.-Y. (2019). "Investment Behaviors Can Tell What Inside: Exploring Stock Intrinsic Properties for Stock Trend Prediction," in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, August 4–8, 2019, 2376–2384.

Chen, J., Luo, C., Pan, L., and Jia, Y. (2021). Trading Strategy of Structured Mutual Fund Based on Deep Learning Network. Expert Syst. Appl. 183, 115390. doi:10.1016/j.eswa.2021.115390

Colby, R. W., and Meyers, T. A. (1988). The Encyclopedia of Technical Market Indicators. Homewood, IL: Dow Jones-Irwin.

Dempster, M. A. H., and Leemans, V. (2006). An Automated FX Trading System Using Adaptive Reinforcement Learning. Expert Syst. Appl. 30, 543–552. doi:10.1016/j.eswa.2005.10.012

Deng, Y., Bao, F., Kong, Y., Ren, Z., and Dai, Q. (2016). Deep Direct Reinforcement Learning for Financial Signal Representation and Trading. IEEE Trans. Neural Netw. Learn. Syst. 28, 653–664. doi:10.1109/TNNLS.2016.2522401

Giudici, P., Pagnottoni, P., and Polinesi, G. (2020). Network Models to Enhance Automated Cryptocurrency Portfolio Management. Front. Artif. Intell. 3, 22. doi:10.3389/frai.2020.00022

Giudici, P., Leach, T., and Pagnottoni, P. (2021). Libra or Librae? Basket Based Stablecoins to Mitigate Foreign Exchange Volatility Spillovers. Finance Res. Lett., 102054. doi:10.1016/j.frl.2021.102054

Huang, D.-j., Zhou, J., Li, B., Hoi, S. C. H., and Zhou, S. (2016). Robust Median Reversion Strategy for Online Portfolio Selection. IEEE Trans. Knowl. Data Eng. 28, 2480–2493. doi:10.1109/tkde.2016.2563433

Lee, R. S. (2019). Chaotic Type-2 Transient-Fuzzy Deep Neuro-Oscillatory Network (CT2TFDNN) for Worldwide Financial Prediction. IEEE Trans. Fuzzy Syst. 28 (4), 731–745. doi:10.1109/tfuzz.2019.2914642

Lee, R. (2020). Quantum Finance: Intelligent Forecast and Trading Systems. Singapore: Springer.

Li, Z., Yang, D., Zhao, L., Bian, J., Qin, T., and Liu, T.-Y. (2019). "Individualized Indicator for All: Stock-wise Technical Indicator Optimization with Stock Embedding," in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, August 4–8, 2019, 894–902.

Marques, N. C., and Gomes, C. (2010). "Maximus-AI: Using Elman Neural Networks for Implementing a SLMR Trading Strategy," in International Conference on Knowledge Science, Engineering and Management, Belfast, United Kingdom, September 1–3, 2010 (Springer), 579–584. doi:10.1007/978-3-642-15280-1_55

Meng, X., Zhang, J.-W., Xu, J., and Guo, H. (2015). Quantum Spatial-Periodic Harmonic Model for Daily Price-Limited Stock Markets. Physica A: Stat. Mech. Appl. 438, 154–160. doi:10.1016/j.physa.2015.06.041

Mohan, S., Mullapudi, S., Sammeta, S., Vijayvergia, P., and Anastasiu, D. C. (2019). "Stock Price Prediction Using News Sentiment Analysis," in 2019 IEEE Fifth International Conference on Big Data Computing Service and Applications (BigDataService), Newark, CA, April 4–9, 2019, 205–208. doi:10.1109/BigDataService.2019.00035

Moody, J. E., and Saffell, M. (1998). "Reinforcement Learning for Trading," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 917–923.

Neely, C. J., Rapach, D. E., Tu, J., and Zhou, G. (2014). Forecasting the Equity Risk Premium: The Role of Technical Indicators. Manage. Sci. 60, 1772–1791. doi:10.1287/mnsc.2013.1838

Peralta, G., and Zareei, A. (2016). A Network Approach to Portfolio Selection. J. Empirical Finance 38, 157–180. doi:10.1016/j.jempfin.2016.06.003

Pichler, A., Poledna, S., and Thurner, S. (2021). Systemic Risk-Efficient Asset Allocations: Minimization of Systemic Risk as a Network Optimization Problem. J. Financial Stab. 52, 100809. doi:10.1016/j.jfs.2020.100809

Resta, M., Pagnottoni, P., and De Giuli, M. E. (2020). Technical Analysis on the Bitcoin Market: Trading Opportunities or Investors' Pitfall? Risks 8, 44. doi:10.3390/risks8020044

Sutton, R. S., and Barto, A. G. (2018). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.

Sutton, R. S., McAllester, D. A., Singh, S. P., and Mansour, Y. (2000). "Policy Gradient Methods for Reinforcement Learning with Function Approximation," in Advances in Neural Information Processing Systems, 1057–1063.

Tran, D. T., Iosifidis, A., Kanniainen, J., and Gabbouj, M. (2018). Temporal Attention-Augmented Bilinear Network for Financial Time-Series Data Analysis. IEEE Trans. Neural Netw. Learn. Syst. 30, 1407–1418. doi:10.1109/TNNLS.2018.2869225

Vella, V., and Ng, W. L. (2015). A Dynamic Fuzzy Money Management Approach for Controlling the Intraday Risk-Adjusted Performance of AI Trading Algorithms. Intell. Sys. Acc. Fin. Mgmt. 22, 153–178. doi:10.1002/isaf.1359

Wang, J., Wang, J., Fang, W., and Niu, H. (2016). Financial Time Series Prediction Using Elman Recurrent Random Neural Networks. Comput. Intell. Neurosci. 2016, 14. doi:10.1155/2016/4742515

Wang, J., Zhang, Y., Tang, K., Wu, J., and Xiong, Z. (2019). "AlphaStock: A Buying-Winners-and-Selling-Losers Investment Strategy Using Interpretable Deep Reinforcement Attention Networks," in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, August 4–8, 2019, 1900–1908.

Wei, B., Yue, J., Rao, Y., and Boris, P. (2017). A Deep Learning Framework for Financial Time Series Using Stacked Autoencoders and Long Short-Term Memory. PLoS One 12, e0180944. doi:10.1371/journal.pone.0180944

Wold, S., Esbensen, K., and Geladi, P. (1987). Principal Component Analysis. Chemometrics Intell. Lab. Syst. 2, 37–52. doi:10.1016/0169-7439(87)80084-9

Xiong, Z., Liu, X.-Y., Zhong, S., Yang, H., and Walid, A. (2018). Practical Deep Reinforcement Learning Approach for Stock Trading. arXiv preprint arXiv:1811.07522.

Ye, C., and Huang, J. P. (2008). Non-classical Oscillator Model for Persistent Fluctuations in Stock Markets. Physica A: Stat. Mech. Appl. 387, 1255–1263. doi:10.1016/j.physa.2007.10.050


Source: https://www.frontiersin.org/articles/10.3389/frai.2021.749878/full
