Entity-relation extraction is a core task in information extraction, and the entity-relation triples extracted from text are the basis for building large-scale knowledge graphs. The traditional pipeline method decomposes entity-relation extraction into two subtasks: named entity recognition and relation extraction. First, a named entity recognizer is built to identify entity boundaries and types in large-scale unstructured text. Then, the recognized entities and their types are used to label the data for the relation extraction task. Finally, a relation extractor predicts the relation category between two entities, and the results are combined into structured entity-relation triples. However, the pipeline method suffers from error accumulation: because the labeled data used for relation extraction come from the preceding named entity recognition step, any recognition errors propagate and degrade the quality of the relation extraction results. In addition, the two subtasks are learned and trained independently, which weakens the feature association between them and leads to redundant entities. Because of this lack of interaction, the text information is not fully utilized, which is the main reason the performance of the pipeline method is limited. In particular, the pipeline method has difficulty capturing long-range dependencies between entities, and it struggles to reach the performance that joint extraction models achieve.
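The error accumulation of the pipeline method described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the gazetteer `GAZETTEER` stands in for a trained named entity recognizer, the rule table `RULES` stands in for a trained relation extractor, and the function names are hypothetical. It is not a model from the reviewed literature.

```python
from itertools import permutations

# Hypothetical gazetteer standing in for a trained named entity recognizer.
GAZETTEER = {"Marie Curie": "PERSON", "Warsaw": "LOCATION"}

# Hypothetical rules standing in for a trained relation extractor:
# (head type, tail type, trigger phrase) -> relation label.
RULES = {("PERSON", "LOCATION", "was born in"): "born_in"}


def recognize_entities(sentence, gazetteer=GAZETTEER):
    """Step 1: return the (mention, type) pairs found in the sentence."""
    return [(m, t) for m, t in gazetteer.items() if m in sentence]


def classify_relation(sentence, head, tail, rules=RULES):
    """Step 2: label one ordered entity pair, or return None."""
    (h_text, h_type), (t_text, t_type) = head, tail
    for (rule_h, rule_t, trigger), label in rules.items():
        if (rule_h, rule_t) == (h_type, t_type) and \
                f"{h_text} {trigger} {t_text}" in sentence:
            return label
    return None


def pipeline_extract(sentence, gazetteer=GAZETTEER):
    """Run step 1, then step 2 on every ordered pair of recognized entities."""
    entities = recognize_entities(sentence, gazetteer)
    triples = []
    for head, tail in permutations(entities, 2):
        label = classify_relation(sentence, head, tail)
        if label is not None:
            triples.append((head[0], label, tail[0]))
    return triples


sentence = "Marie Curie was born in Warsaw."
print(pipeline_extract(sentence))

# Simulated recognizer error: the location is missed in step 1, so step 2
# never sees the pair and the triple is lost regardless of its own quality.
faulty = {"Marie Curie": "PERSON"}
print(pipeline_extract(sentence, faulty))
```

In this toy setting, the second call returns no triples because the relation step only ever sees what the recognition step passes on, which is the dependency that joint extraction models try to remove.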
In practical applications, there are often multiple relations between entities. Because the pipeline method cannot fully consider the global text information, its named entity recognition step produces redundant entities, which is a disadvantage when extracting multiple overlapping relations. The pipeline approach therefore falls short when a high-accuracy entity-relation extraction model is required. This paper reviews the research and development of joint entity-relation extraction. It first briefly describes the common shortcomings of four types of joint models based on feature engineering: integer linear programming, card-pyramid parsing models, probabilistic graphical models, and structured prediction models. It then focuses on joint entity-relation extraction techniques based on deep learning and summarizes the mainstream construction methods of these models according to the state-of-the-art results reported in recent years. According to their modeling ideas, the methods are categorized into three types: multi-module/multi-step, multi-module/single-step, and single-module/single-step models. Multi-module/multi-step methods comprise three main types: mapping the entity domain to the relation domain, mapping the relation domain to the entity domain, and mapping the head-entity domain to the relation-tail domain. What these three types have in common is that they divide triple extraction into multiple modules, integrate the modules through shared parameters, and obtain triples step by step. This approach improves the performance of the joint model and goes some way toward solving the problems of the pipeline method. However, because each step uses an independent decoding algorithm, decoding errors accumulate.
Moreover, because the modules are integrated through shared parameters, the redundant errors of each module affect the prediction performance of the others, which results in cascading redundancy. The multi-module/single-step modeling method aims to construct an optimal joint decoding algorithm and to obtain the optimal solution in order to determine the optimal hyperparameters. By designing a simple and accurate joint decoding algorithm and strengthening the interaction between submodules, this method weakens the impact that the decoding errors and cascading redundancy caused by step-by-step iteration have on the performance of the joint model. However, because the modules remain separate, redundancy errors still occur, which limits the approach. The single-module/single-step modeling method extracts triples directly from text, which effectively alleviates the cascading error and entity redundancy problems of the multi-module/multi-step and multi-module/single-step methods. Taking representative joint models from the high-impact literature as examples, this paper analyzes the modeling idea, advantages, and disadvantages of each model. It also groups a number of classical models by their common modeling ideas to illustrate trends in the development of joint entity-relation extraction models. This paper then compares the performance of representative single-module/single-step models with that of multi-module/multi-step and multi-module/single-step models on a public benchmark data set. It further clarifies the trend that the modeling idea of joint extraction models is gradually shifting from complex multi-module/multi-step and multi-module/single-step methods to efficient single-module/single-step models. Finally, this paper discusses prospective research directions in the joint extraction of entity-relation triples.
Current mainstream joint models focus on entity-relation extraction in limited domains; open-domain joint entity-relation extraction remains an urgent problem for future research. In practical industrial applications, a text corpus contains multiple types of information, such as temporal information. However, most current joint entity-relation extraction models extract features from the context of a single text only and thus ignore time-series information. Incorporating multivariate information such as time-series information could further improve the performance of joint models, which makes this an important topic for future work. In addition, there has been little research on cross-text joint entity-relation extraction, which is another future research topic in this field. This paper aims to establish a complete view of deep learning-based joint entity-relation extraction research that will be helpful to researchers in related fields.