
Predicting risk of early discontinuation of exclusive breastfeeding at a Brazilian referral hospital for high-risk neonates and infants: a decision-tree analysis



Determinants at several levels may affect breastfeeding practices. Besides the known historical, socio-economic, cultural, and individual factors, other components also pose major challenges to breastfeeding. Predicting existing patterns and identifying modifiable components are important for achieving optimal results as early as possible, especially in the most vulnerable population. The goal of this study was to build a tree-based analysis to determine the variables that can predict the pattern of breastfeeding at hospital discharge and at 3 and 6 months of age in a referral center for high-risk infants.


This prospective, longitudinal study included 1003 infants and was conducted at a high-risk public hospital in the following three phases: hospital admission, first visit after discharge, and monthly telephone interview until the sixth month of the infant’s life. Independent variables were sorted into four groups: factors related to the newborn infant, mother, health service, and breastfeeding. The outcome was breastfeeding as per the categories established by the World Health Organization (WHO). For this study, we performed an exploratory analysis at hospital discharge and at 3 and at 6 months of age in two stages, as follows: (i) determining the frequencies of baseline characteristics stratified by breastfeeding indicators in the three mentioned periods and (ii) decision-tree analysis.


The prevalence of exclusive breastfeeding (EBF) was 65.2% at hospital discharge, 51% at 3 months, and 20.6% at 6 months. At hospital discharge and the sixth month, the length of hospital stay was the most important predictor of feeding practices, also relevant at the third month. Besides the mother’s and child’s characteristics (multiple births, maternal age, and parity), the social context, work, feeding practice during hospitalization, and hospital practices and policies on breastfeeding influenced the breastfeeding rates.


The combination algorithm of decision trees (a machine learning technique) provides a better understanding of the risk predictors of breastfeeding cessation in a setting with a large variability in exposures. Decision trees may provide a basis for recommendations aimed at this high-risk population, within the Brazilian context, in light of the hospital stay at a neonatal unit and period of continuous feeding practice.


Globally, determinants at several levels may affect breastfeeding practices [1]. In environments subject to clinical vulnerability, besides the several known historical, socio-economic, cultural, and individual factors, other components also pose major challenges to breastfeeding [2, 3]. Brazilian studies, selected in a systematic review [4] on breastfeeding determinants, have not investigated factors associated with breastfeeding in high-risk infants. In addition, such studies were based on regression models (Poisson, logistic, Cox) for statistical analysis [4], a technique also broadly used in the international literature in this field [5].

Traditional regression models are often limited in the exploration of the mutual importance of exposures. Thus, machine learning techniques may be able to investigate the network between exposures and eventually develop decision rules for estimating the risk of early discontinuation of exclusive breastfeeding (EBF) in clinical work. Predicting existing patterns and identifying modifiable components, along with existing studies, are important for reaching the best results as early as possible, especially when dealing with vulnerable populations. Studies using methodologies for predicting situations that might lead to early discontinuation of breastfeeding may help design effective decision-making strategies, especially for subgroups facing major challenges in daily clinical practice.

In the present study, a decision tree model was constructed and validated to determine the variables that can predict the pattern of breastfeeding at hospital discharge and at 3 and 6 months of age, in a referral center for high-risk infants.


Design, setting, and study participants

This was a prospective cohort study conducted in Rio de Janeiro, Brazil, at the National Institute of Women, Children and Adolescents’ Health Fernandes Figueira (IFF) of the Oswaldo Cruz Foundation (FIOCRUZ), a public referral hospital for fetuses, neonates, and infants at high risk. This public hospital attends to about 1000 deliveries per year, is accredited under the Baby-Friendly Hospital Initiative (BFHI), and receives newborns and children with congenital malformations or genetic syndromes from all over Brazil.

The study population included all neonates delivered or transferred to the referral center from March 2017 to April 2018. Of the 1200 eligible participants, 154 were excluded due to non-eligibility, 30 could not meet the research assistant, and the other 13 nursing mothers declined to participate in the study. Figure 1 illustrates the flowchart of the selection process of the participants in this study. Details about participants, setting, and procedures have been published elsewhere [6].

Fig. 1

Flowchart of participant selection. Note: FIOCRUZ = Oswaldo Cruz Foundation; HIV = Human Immunodeficiency Virus; HTLV = Human T-cell Lymphotropic Virus; IFF = National Institute of Women, Children and Adolescents Health Fernandes Figueira

Data collection

In all, 1003 infants were enrolled in the longitudinal study of breastfeeding conducted in a Brazilian referral center for high-risk fetuses, neonates, and infants. Each infant was followed up for up to 6 months of life. The end of the follow-up period was October 2018.

This study was developed in three phases: (a) in the first phase, data were obtained from interviews with mothers and medical records; (b) in the second phase, the mothers were interviewed during the first visit after hospital discharge; and (c) in the third phase, telephone interviews were conducted every month until the sixth month of the infant’s life. Regarding this last phase, up to 10 telephone contact attempts were made with each participant each month to minimize loss to follow-up. Data were collected through a web application developed for the research, which could be accessed by using a mobile and/or computer with internet access. A control and quality assurance process was established for data collection, as described elsewhere [6].

Data measures

The outcome was investigated every month during telephone interviews and was assessed by the question “During the month preceding the interview, what foods have you offered to your children?” The response categories were mother’s milk, another type of milk, water, tea, juice, fruits, and any other foods. The participants were categorized into four groups for the analysis of the outcome, according to the set of indicators used for assessing breastfeeding practices that reflect the guidelines on breastfeeding: exclusive breastfeeding (EBF), i.e., breastfeeding not supplemented with any other fluids or solid foods; predominant breastfeeding, i.e., breastfeeding supplemented with fluids such as water, tea, or fruit juices but not solid or semi-solid foods; partial breastfeeding (PBF), i.e., breastfeeding supplemented with other types of milk, such as infant formula, and solid or semi-solid foods; and non-breastfed (NBF), i.e., no breastfeeding [7]. Owing to the low prevalence of “predominant breastfeeding” in the third and sixth months, it was not possible to use this category alone in the analysis. Therefore, the categories “exclusive breastfeeding” and “predominant breastfeeding” were combined and renamed as “exclusive or predominant breastfeeding” (EPB).
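The categorization described above can be sketched as a small function. This is an illustration only, not the study's code: the food labels (`mothers_milk`, `other_milk`, and so on) are hypothetical names for the response categories of the interview question, and treating any other milk or any solid food as partial breastfeeding is an assumption made for simplicity.

```python
from typing import Iterable

# Hypothetical labels for the response categories of the interview question.
WATER_BASED = {"water", "tea", "juice"}
MILK_OR_SOLIDS = {"other_milk", "fruits", "other_foods"}


def classify_feeding(foods: Iterable[str], merge_predominant: bool = False) -> str:
    """Map the foods offered in the preceding month to a feeding-practice category."""
    offered = set(foods)
    if "mothers_milk" not in offered:
        return "NBF"  # non-breastfed
    if offered & MILK_OR_SOLIDS:
        return "PBF"  # breast milk plus other milk and/or solid foods (assumed rule)
    if offered & WATER_BASED:
        # breast milk plus water-based fluids only
        return "EPB" if merge_predominant else "predominant"
    return "EPB" if merge_predominant else "EBF"
```

Setting `merge_predominant=True` mirrors the combined "exclusive or predominant breastfeeding" (EPB) category used for the third and sixth months.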

The covariates used in the analysis represented (a) maternal factors – “maternal education,” “tobacco use during pregnancy,” “parity and previous experience of breastfeeding,” “presence of partner at home,” “household income” (as compared to the reference value of the prevailing monthly minimum wage in Brazil, which is the minimum payment value per month for formal employees, as prescribed by law), “gestational morbidity,” “maternal work and maternity leave,” “maternal age,” and “breastfeeding difficulties”; (b) child-related factors – “multiples at births,” “birthweight,” “gestational age,” “perinatal morbidity,” and “surgical morbidity at birth”; and (c) health service-related factors – “length of hospital stay,” “use of pasteurized donor human milk,” “infant received formula,” “use of cup-feeding,” “skin-to-skin contact in the delivery room,” “place of hospital admission” (maternity ward or neonatal intensive care unit), “breastfeeding advising during prenatal period,” “use of a pacifier,” and “mode of delivery.” In the third and sixth months, the variables “hospital readmission,” “feeding practice at hospital discharge,” and “breastfeeding difficulties in the month prior to the monthly interview” were added.

Data analysis

The first stage involved a bivariate analysis of maternal and neonatal characteristics according to the feeding practices at hospital discharge and at 3 and 6 months of age. The associations were checked by Pearson’s chi-squared tests. When the expected frequency was lower than five in the contingency tables, Fisher’s exact test was applied. The Dunn test was applied for the analysis of variables “length of hospital stay” and “feeding practice” at 3 and 6 months of age. Because reliance on p-values is not recommended in large samples [8], confidence intervals (CI) were provided as a measure of uncertainty, and p-values were treated as additional information. In addition, when included and excluded participants were compared, differences of at least 10 percentage points (pp) among feeding practices were taken to suggest a meaningful difference.
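As an illustration of how a 95% CI for a prevalence can be obtained, the sketch below uses the normal-approximation (Wald) interval. The paper does not state which interval method was used, so the choice of method is an assumption, and the counts in the usage note are invented.

```python
import math
from typing import Tuple


def wald_ci(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clip to [0, 1] because the approximation can overshoot near the bounds.
    return max(0.0, p - half_width), min(1.0, p + half_width)
```

For example, `wald_ci(50, 100)` gives an interval of roughly 0.40 to 0.60 around the observed proportion of 0.50.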

In the second stage, decision-tree models were fitted using the CART algorithm [9] at hospital discharge and in the third and sixth months, with the indicators for assessing breastfeeding practices used as dependent variables. Decision-tree models are machine learning algorithms that define rules for recursive binary divisions (binary because each parent node is always divided into exactly two child nodes, and recursive because the process can be repeated by treating each child node as a parent node), expressed in values or categories of the independent variables, with the purpose of predicting a categorical variable; the result is represented as a decision-tree graph [9].

From the total set of analyzed data, i.e., the “root” of the tree, the algorithm selects predictor variables for each possible partition, the “nodes,” using an impurity measure defined according to the category distribution of the predicted variable in the subgroups derived from the possible divisions, generating a “branch” until a minimum number of elements in the subdivision is reached or until there are no gains in prediction [10, 11]. The tree “leaves” represent the outcome categories resulting from these divisions.
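A minimal sketch of how an impurity measure drives a single binary split is given below. It assumes the Gini index, a common choice for CART classification trees; the paper only says "an impurity measure", so this choice, the function names, and the toy data in the usage note are assumptions.

```python
from collections import Counter
from typing import Sequence, Tuple


def gini(labels: Sequence[str]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values: Sequence[float], labels: Sequence[str]) -> Tuple[float, float]:
    """Return (threshold, impurity decrease) for the best split 'value < threshold'."""
    n = len(labels)
    parent = gini(labels)
    best = (float("nan"), 0.0)
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        t = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x < t]
        right = [y for x, y in zip(values, labels) if x >= t]
        weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
        gain = parent - weighted
        if gain > best[1]:
            best = (t, gain)
    return best
```

With invented stays of `[2, 3, 5, 20, 30, 45]` days and labels `["EBF", "EBF", "EBF", "NBF", "NBF", "NBF"]`, the best split is at 12.5 days, which separates the two classes completely. A full tree repeats this search in each child node until a stopping rule is met.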

There are two important reasons to consider variable selection using decision trees when developing risk predictions. First, limiting the number of inputs to be supplied by the user may increase the utilization of a prediction tool. Second, the elimination of variables that are not predictive may improve prediction accuracy [12].

A 10-fold cross-validation process with three repetitions was used to tune the maximum-depth hyperparameter for each of the three models, and the value giving the highest accuracy was selected. The fitted models were presented as a decision tree for each period, each with at least two informative variables.
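The tuning procedure can be sketched in plain Python as follows. This is not the caret implementation used in the study: `repeated_kfold` and `tune_max_depth` are hypothetical helpers, and `score` stands for any user-supplied routine that fits a tree of a given maximum depth on the training indices and returns its accuracy on the held-out indices.

```python
import random
from statistics import mean
from typing import Callable, Dict, Iterator, List, Sequence, Tuple


def repeated_kfold(n: int, k: int = 10, repeats: int = 3,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for `repeats` shuffled k-fold partitions."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k folds of nearly equal size
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, test


def tune_max_depth(n: int, depths: Sequence[int],
                   score: Callable[[int, List[int], List[int]], float]
                   ) -> Tuple[int, Dict[int, float]]:
    """Return the depth with the highest mean held-out accuracy, plus all means."""
    splits = list(repeated_kfold(n))  # the same splits are reused for every depth
    mean_acc = {d: mean(score(d, tr, te) for tr, te in splits) for d in depths}
    best = max(mean_acc, key=mean_acc.get)
    return best, mean_acc
```

Reusing the same 30 splits for every candidate depth keeps the comparison between depths paired, so differences in mean accuracy are not driven by different random partitions.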

The tree is designed with graphic boxes and lines. The predictor of major importance is at the top, and the branches are built according to a decreasing hierarchy of importance until it reaches the leaf. Inside each leaf, located in the lower part of the tree, the most frequent feeding practice is highlighted. The second line presents the probability for each outcome category, in the following sequence: EBF (hospital discharge), EPB (third and sixth months), PBF, and NBF. The last leaf line shows the frequency of participants from that branch.

The participants who were lost to follow-up were excluded from the analysis. From the original sample, 75 children (7.5%) considered for the analysis in the first stage of the study (baseline) did not continue after hospital discharge, so they were excluded from the total number of participants.

R (R Foundation for Statistical Computing), version 3.5.2, was used to analyze the data. The rpart library [11] was used to fit the decision-tree model; the caret library [13] was used to tune the maximum-depth parameter with 10-fold cross-validation, and the rattle library [14] was used to obtain the decision-tree graphs. This study was approved by the Ethics Committees at IFF/FIOCRUZ, Brazil (Protocol Number: 1.930.996–2017).


The prevalence of exclusive breastfeeding at discharge was 65.2% (95% CI 62.2,68.3), and 51% at 3 months (95% CI 47.1,54.8); 20.6% (95% CI 16.5,25.0) of the participants were still exclusively breastfed at 6 months postpartum. A few mothers maintained predominant breastfeeding for 3 months (7.1%; 95% CI 3.2,11.0) and 6 months (9.3%; 95% CI 5.2,13.7); therefore, the EPB category had a higher proportion of infants from the “exclusive breastfeeding” category than from the “predominant breastfeeding” category.

Table 1 shows the wide variability in mother and infant characteristics according to the feeding practice at discharge and at 3 and 6 months. The mothers had a mean age of 27 years, ranging from 13 to 46 years; nearly all mothers had planned to breastfeed, and it is important to highlight that over 50% of mothers had some difficulty with breastfeeding before discharge.

Table 1 Characteristics of the participants stratified by feeding practice and period. Rio de Janeiro, Brazil, 2018

Of the infants, 17 (1.7%) had extremely low birthweight, 21 (2.1%) had very low birthweight, 159 (15.9%) had low birthweight, 226 (22.5%) were preterm, and 149 (14.9%) were multiples (twins, triplets, and quadruplets).

Further, 32% of the infants were admitted to the neonatal or neosurgical intensive care unit (NICU), for a mean of 11 days (ranging from 2 to 150 days); 417 (42%) had perinatal morbidity, of which 189 (18.8%) had surgical anomalies and 11 (1.1%) had genetic syndromes such as Down, Werdnig-Hoffmann, Turner, and Beckwith-Wiedemann syndromes.

After reassessing the whole sample for data checking and disregarding cases with missing data in the three periods of the study, the analysis included data on 757 participants at hospital discharge, 526 participants in the third month, and 459 participants in the sixth month. When the participants included in the study were compared with those excluded due to missing data, there were differences in the social determinants “maternal age,” “maternal work and maternity leave,” and “maternal education” between these groups (Additional file 1).

The median “length of hospital stay” gradually increased from EBF to NBF during the three analyzed periods. At discharge, the median in the NBF group (43 days) was about 10-fold greater than that in the EBF group (4 days); it was approximately two to three times greater in the third month (EPB median = 3 days; NBF median = 9.5 days) and the sixth month (EPB median = 3 days; NBF median = 8.5 days) (Fig. 2).

Fig. 2

Boxplot of median length of hospital stay regarding feeding practice at hospital discharge, in the third and in the sixth month of life. Note: EBF = Exclusive Breastfeeding; EPB = Exclusive or Predominant Breastfeeding. PBF = Partial Breastfeeding. NBF = Non-Breastfed. The length of hospital stay was measured in days

The mean accuracy of the fitted model on 10-fold cross-validation of the decision tree for the feeding practice was 83% at discharge (Fig. 3), 63% at 3 months (Fig. 4), and 50% at 6 months (Fig. 5).

Fig. 3

Decision-tree of 757 children at hospital discharge, Rio de Janeiro, Brazil, 2018. Note: EBF = Exclusive Breastfeeding; PBF = Partial Breastfeeding. NBF = Non-Breastfed. DHM = Donor human milk. Y.O. = years old. NICU = Neonatal intensive care unit. The length of hospital stay was measured in days

Fig. 4

Decision-tree of 526 children at 3 months, Rio de Janeiro, Brazil, 2018. Note: EPB = Exclusive or Predominant Breastfeeding. PBF = Partial Breastfeeding. NBF = Non-Breastfed. DHM = Donor human milk. ML = Maternity leave. Y.O. = years old. PEBF = Previous experience of breastfeeding. The length of hospital stay was measured in days

Fig. 5

Decision tree of 459 children at 6 months, Rio de Janeiro, Brazil, 2018. Note: Household income (expressed in comparison to a reference value of two Brazilian monthly minimum wages at the time of the perinatal interview). ‘Minimum wage’ refers to the monthly minimum wage, as established by law, for formal employees in Brazil. PBF = Partial Breastfeeding. NBF = Non-Breastfed. The length of hospital stay was measured in days

At hospital discharge, the decision tree defined the “length of hospital stay” as the most important predictor of breastfeeding practice. For a length of hospital stay shorter than 16 days, the highest prevalence of EBF (96%) was observed in newborns who were not cup fed; among infants who were cup fed and in a maternity ward, the prevalence of EBF was 91%; among those cup fed with pasteurized donor human milk and in the NICU, the EBF percentage dropped by more than 40 percentage points (pp) with the use of a pacifier (69% without a pacifier versus 26% with a pacifier); and for children who were cup fed and did not receive pasteurized human milk, PBF was prevalent, at a rate of 90% at hospital discharge (Fig. 3).

The prevalence of EBF was 78% among infants who stayed in the hospital for 16–42 days and were not fed with pasteurized human milk. Within the group that was fed pasteurized human milk, PBF was prevalent among infants of mothers aged 20–34 years (67%). Among the younger and older mothers, when cup feeding was used, PBF was the most prevalent practice (47%), followed by EBF (40%); when cup feeding was not used, the exclusive use of infant formula was prevalent (85%), and only 15% were still breastfed at hospital discharge.

Regarding the length of hospital stay of 43 days or more, NBF was prevalent at hospital discharge (78%), a branch not explained by any other predictor (Fig. 3).

In the third month of life, four variables that did not explain breastfeeding at hospital discharge were identified in the decision tree: “multiple births,” “maternal work and maternity leave,” “parity and previous experience of breastfeeding,” and “feeding practice at discharge.” The infants were divided into nine groups determined by eight nodes with 63% accuracy. EPB practice was predominant in four groups, comprising 72% of the participants. The probability of EPB ranged from 0 to 72% among the nine groups. The length of hospital stay remained an important predictor of the outcome, and multiples at births was highlighted as the most important predictor.

Among newborns who were multiples at births, PBF was frequent (58%), followed by EPB (25%). In singleton births with length of hospital stay shorter than 21 days, EPB was prevalent (varying from 22 to 72%) for any working condition, maternal age, parity, and when there was no supplementation with pasteurized donor human milk during the hospital stay. However, among women who worked at home, there was a drop in the prevalence of EPB among primiparous women as compared to among multiparous women (22 and 64%, respectively). The drop in EPB was also observed among infants born to older women (aged 35 years or older) who had been hospitalized for a period from 4 to 20 days and among infants supplemented with pasteurized human milk during the hospital stay. In this group of infants, the probability of EPB was half of that of the group that was not supplemented with pasteurized human milk (33 and 62%, respectively) (Fig. 4).

Hospital stay duration of 21 days or longer resulted in a low prevalence of EPB in the third month of life, varying from 0 to 29%. In this branch, breastfeeding was maintained in infants who were exclusively or partially breastfed at hospital discharge although most of them had already received infant formula (57%). The full discontinuation of breastfeeding, along with the use of infant formula, during the hospital stay resulted in the absence of EPB (0%) and a high prevalence of NBF (83%) (Fig. 4).

In the sixth month, the most accurate tree (54%) indicated that the length of hospital stay was the sole predictor of breastfeeding, with PBF and NBF prevalent among children with a length of hospital stay of < 18 days and ≥ 18 days, respectively. The second most accurate tree in the cross-validation analysis (50%), which had at least two predictive determinants, is the one presented in Fig. 5. Infants were divided into four groups, formed by three nodes. Most of the sample belonged to two groups in which PBF was prevalent (83% of the participants). The probability of EPB ranged from 5 to 34% in the four groups.

In the sixth month of life, the length of hospital stay was still the most relevant predictor of feeding practice (the root node) as shown by the data. Among infants with a length of hospital stay shorter than 18 days, the prevalence of EPB varied from 5 to 34%; in the group of non-multiple pregnancies, PBF was prevalent (55%) followed by EPB (34%); in cases of multiple pregnancies, the change from PBF to NBF was found to be motivated by the increment in income and the prevalence of EPB dropped from 20 to 5%; and among infants with a long duration of hospital stay (of 18 days or longer), NBF was prevalent, and EPB was 14% (Fig. 5).
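To make the sixth-month rules concrete, the sketch below encodes the four leaves as read from the prose above. It is an illustrative reading, not the fitted rpart object: the function name is hypothetical, and assigning an income of exactly two minimum wages to the lower-income group is an assumption, since the text does not state how the boundary is handled.

```python
from typing import Tuple


def six_month_rule(stay_days: float, multiple_birth: bool,
                   income_in_min_wages: float) -> Tuple[str, float]:
    """Return (most frequent feeding practice, reported EPB probability) for a leaf."""
    if stay_days >= 18:
        return "NBF", 0.14  # long hospital stay
    if not multiple_birth:
        return "PBF", 0.34  # singleton, short stay
    # Multiple births with a short stay are split on household income relative
    # to two monthly minimum wages; the handling of exactly 2 is assumed.
    if income_in_min_wages <= 2:
        return "PBF", 0.20
    return "NBF", 0.05
```

For instance, `six_month_rule(5, True, 3)` falls in the leaf where NBF is most frequent and the reported EPB probability is 5%.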


The prevalence of EBF was 65.2% at discharge, 51% at 3 months, and 20.6% at 6 months. It is important to highlight that 48.6% of the infants continued breastfeeding (PBF) in the sixth month. In the studied cohort, the analyzed components affected the risk prediction in different ways at different moments of an infant’s life (at hospital discharge and at 3 and 6 months). In the three periods mentioned above, the length of hospital stay was relevant to the feeding practice. Besides the mother’s and child’s characteristics (multiples at births, maternal age, and parity), the social context, work, feeding practice during hospitalization, and several hospital practices and policies on breastfeeding influence the breastfeeding rates.

The length of hospital stay, a highlighted component in all periods, is a proxy for the severity of the child’s situation and the effectiveness of the provided care. The mother-infant separation [15, 16] may interfere with the recovery and negatively impact the hospital stay period [17]. Preterm newborns with low birthweight generally have long lengths of hospital stay that increase their vulnerability to negative outcomes and potentially affect the life trajectory of survivors [17, 18].

Previous studies [3, 19, 20] have shown that neonates with prolonged length of hospital stay are less likely to be breastfed than those with short lengths of stay. Thus, long lengths of hospital stay must involve a detailed exposition of hospital practices and special breastfeeding support and guidance to mothers of high-risk newborns in order to improve breastfeeding rates. Some studies show that the greater the rate of breastfeeding in the NICU, the shorter the length of hospital stay [21] and the higher the cost savings [22, 23].

This study highlights the need to implement hospital practices to promote breastfeeding in hospitals that care for high-risk newborns and supports the expansion of the BFHI and efforts within the scope of public health policies to ensure that human milk banks (HMBs) fulfill their role as agents of promotion, protection, and support for breastfeeding (with special emphasis on the risk segment of neonatal care), so that a long hospital stay does not adversely affect the rates of breastfeeding.

In the hospital discharge tree, the shift in the predominant feeding practice from EBF to PBF was associated only with the use of a pacifier among neonates hospitalized in the NICU, and the shift from PBF to NBF was associated with the non-use of cup feeding among infants with long lengths of hospital stay. The use of a pacifier and the non-use of cup feeding of human milk were predictors that negatively affected breastfeeding in the group of newborns who received supplementation with pasteurized donor human milk.

During the hospital stay, some components may facilitate or hinder the early establishment of EBF. Our results are consistent with other findings on cup feeding, which improves EBF rates at discharge, even in preterm babies and those with low birthweight [24,25,26]. This may be due to the similarity in the muscle activity in the orofacial region of infants who are breastfed and cup fed [27, 28].

Our data show that the use of human milk during the hospital stay was associated with EBF at discharge. When supplements are required or desired, human milk provided by the mother [29] or by an HMB [30] offers several benefits to hospitalized high-risk newborns [2, 21, 31,32,33]. There are well-documented general and systemic benefits [1] as well as specific benefits of human milk for high-risk newborns, such as protection from necrotizing enterocolitis, retinopathy of prematurity, and bronchopulmonary dysplasia, among others [33,34,35]. All these specific benefits also impact the length of hospital stay.

The use of a pacifier was found to be a predictor of early termination of EBF at discharge. Studies have shown that the use of a pacifier may be a risk factor for the early discontinuation of EBF [4, 36, 37] and that the association is related to the time it was introduced and the frequency of use [38]. This happens even among mothers who are highly motivated to breastfeed [39]. Minimizing the use of a pacifier during the transition process of the newborn from tube feeding to breastfeeding is associated with early exclusive breastfeeding [3, 40].

Breastfeeding practice during hospital stay was one of the major predictors of the continuation of this behavior in the third month. A recent study [1, 41] adapted the determinants of breastfeeding practice by highlighting the chronology of breastfeeding indicators; the study showed that to ensure consistent practice, the practice must be followed at different moments (from the establishment of this practice in the first hour to the second year of life).

Another important predictor in the third and sixth months was multiple pregnancy. A previous study [42] showed that twin newborns are not breastfed at the same rate as single newborns and have a higher risk of early weaning.

A change in the feeding practice was noticed in the decision tree in the third month in relation to hospital discharge and supplementation with human milk (during hospital stay). In order to better understand this prediction, the characteristics of 24 children in this group were explored (average length of hospital stay = 9 days): 15 were born with perinatal morbidity, 4 were preterm, 13 remained hospitalized in the maternity ward, 11 remained hospitalized in the NICU, none of them used a pacifier during hospital stay, 22 were cup fed during hospital stay, 16 did not have skin-to-skin contact, and 13 mothers had difficulties in breastfeeding in the last month.

Feeding supplementation negatively interferes with the decision to breastfeed, especially in primiparous or older women (35 years old or over) [43]. Once supplements are introduced during the hospital stay, regardless of the type of milk prescribed, women start questioning their capacity to breastfeed [44]. As a result, there is a high tendency to offer supplements at home. The advice and practices of healthcare professionals influence breastfeeding practices [1].

Long length of hospital stay was a predictor of EBF discontinuation [3, 19, 20]. When there is risk or potential risk at birth, the longer length of hospital stay must be used to expose the mother-infant dyad to favorable hospital practices for breastfeeding [3]. Besides the generic know-how of the healthcare providers, high-level expertise in breastfeeding, experience, and specific skills are the foundations of proper management of vulnerable neonates.

The exclusive breastfeeding rates under 6 months in a high-risk setting were not correlated with overall national breastfeeding rates. The prevalence of EBF in Brazil was approximately 40% among infants aged under 6 months [45]. In this study, the prevalence of EBF at 6 months was 20.6%, which is slightly higher than the prevalence of 14.5% observed in the Pelotas cohort [46] and of 13% in the cohort of preterm babies in Denmark [47]. In the present study, the prevalence of EBF among high-risk newborns was similar to that among low-risk newborns reported in previous studies. Breastfeeding competence and behavior are not developed by factors such as the presence or absence of risk at the time of birth, but instead, they are affected by several determinants related to the mothers, infants, health systems and services, and healthcare providers. The breastfeeding rates in the highlighted studies, although similar to each other, are below international recommendations [48].

Income was found to be a predictor of the analyzed outcomes only in the sixth month. Partial breastfeeding was more common among the poorest mothers with multiple pregnancies than among mothers with a household income higher than twice the monthly minimum wage (over $576). Financially better-off mothers are highly likely to use formulas as a result of marketing pressure and economic well-being [46].

We built an analysis model that provided a robust classification of factors predicting the feeding practice for each infant with an accuracy ranging between 50 and 83%, so it can be used for quick decision making. Although prediction models for breastfeeding have been developed and widely applied, most of them are based almost exclusively on parametric or semi-parametric statistical methods, which rely on restrictive model assumptions. In this paper, we proposed the use of a decision-tree method, which is a completely nonparametric machine learning method for accurate prediction. In addition, in clinical practice, decision trees may be a suitable alternative to traditional statistical methods, since they allow the analysis of interactions between various risk components, including those not known previously. Therefore, this study ranked a set of predictors for the statistical modeling of breastfeeding determinants in hospitals that care for high-risk newborns. The predictive capacity of the model described was linked to the pre-processing techniques carefully adopted in the data analysis stage and sought to deal with problems such as missing data, outliers, and multicollinearity of predictor variables.

As far as we know, this longitudinal study is among the few based on data about breastfeeding rates in high-risk hospitals in Latin America. This is the first Brazilian study that applied machine learning models to predict breastfeeding in a cohort of infants delivered at a high-risk hospital.

The main limitation of this analysis was the selection bias related to the social determinants. The support network was not assessed in this study, which could possibly explain some results. Another limitation refers to the joint analysis of the categories “predominant breastfeeding” and “exclusive breastfeeding” due to the low frequency in the former (7 and 9% in the third and sixth months, respectively). Another limitation could be that public health hospitals mainly serve the low-income population, despite free, universal healthcare being available for all citizens since the creation of the Unified Health System (SUS) in 1988 by the Brazilian Federal Constitution. However, this pattern was not confirmed in our study, since more than half of the participants (60%) had a household income higher than $576 a month, most likely because of the fact that this hospital is a national referral center for high-risk infants. It is relevant to mention that these outcomes pertain to a single center and may not be suitable for generalization to the larger population in Brazil or in other countries.


This study provides a better understanding of the predictors of breastfeeding cessation in settings with a wide range of exposures. The length of hospital stay was the main determinant of breastfeeding practice throughout the first 6 months of life, and multiple pregnancy was an important predictor of this practice in the third and sixth months. Individual determinants related to social context, employment prospects, breastfeeding practice during hospitalization, and the health system were also important predictors of this practice.

The combined decision-tree algorithm is a practical tool that can be used to identify the groups at risk of early discontinuation of EBF and to guide effective and timely interventions that ensure prolonged and high rates of breastfeeding.

Our results suggest that implementing breastfeeding promotion policies in hospitals for high-risk infants can help overcome the difficulties related to breastfeeding among these infants. Our findings may also provide a basis for country-level recommendations for this population.

Availability of data and materials

All relevant data are in this paper. The datasets generated and/or analyzed during the current study are not publicly available because Ethics Committee guidelines restrict full data disclosure, which could compromise participant confidentiality. However, research data are available from the corresponding author on reasonable request. We welcome collaboration on data analysis and publication through specific research proposals sent to the lead researcher and her co-tutors. Additional information can be obtained by sending an e-mail to


  • Exclusive Breastfeeding
  • Exclusive or Predominant Breastfeeding
  • Partial Breastfeeding
  • Oswaldo Cruz Foundation
  • Human Immunodeficiency Virus
  • Human Milk Bank
  • Human T-cell Lymphotropic Virus
  • National Institute of Women, Children and Adolescents Health Fernandes Figueira
  • Neonatal Intensive Care Unit
  • World Health Organization

  1. Rollins NC, Bhandari N, Hajeebhoy N, Horton S, Lutter CK, Martines JC, et al. Why invest, and what it will take to improve breastfeeding practices? Lancet. 2016;387(10017):491–504.

    PubMed  Article  Google Scholar 

  2. Renfrew M, Craig D, Dyson L, McCormick F, Rice S, King S, et al. Breastfeeding promotion for infants in neonatal units: a systematic review and economic analysis. Health Technol Asses. 2009;13(40):1–146 iii-iv.

    CAS  Article  Google Scholar 

  3. Maastrup R, Bojesen SN, Kronborg H, Hallström I. Breastfeeding support in neonatal intensive care: a national survey. J Hum Lact. 2012;28(3):370–9.

    PubMed  Article  Google Scholar 

  4. Boccolini CS, de Carvalho ML, de Oliveira MIC, Boccolini CS, de Carvalho ML, de Oliveira MIC. Factors associated with exclusive breastfeeding in the first six months of life in Brazil: a systematic review. Revista de Saúde Pública. 2015;49:91.

    PubMed Central  Article  Google Scholar 

  5. Behzadifar M, Saki M, Behzadifar M, Mardani M, Yari F, Ebrahimzadeh F, et al. Prevalence of exclusive breastfeeding practice in the first six months of life and its determinants in Iran: a systematic review and meta-analysis. BMC Pediatr. 2019;19:384.

    PubMed  PubMed Central  Article  Google Scholar 

  6. Silva MDB, Oliveira RVC, Braga JU, Almeida JAG, Melo ECP. Breastfeeding patterns in cohort infants at a high-risk fetal, neonatal and child referral center in Brazil: a correspondence analysis. BMC Pediatr. 2020;20:372.

    PubMed  PubMed Central  Article  Google Scholar 

  7. World Health Organization (WHO). Indicators for assessing infant and young child feeding practices. Washington, D.C.: World Health Organization (WHO); 2008.

    Google Scholar 

  8. Wasserstein RL, Lazar NA. The ASA statement on p-values: context, process, and purpose. Am Stat. 2016;70(2):129–33.

    Article  Google Scholar 

  9. Breiman L, Friedman JH, Richard AO, Stone CJ. Classification and regression trees Boca Raton: Chapman & Hall/CRC; 2017.

    Book  Google Scholar 

  10. Tan PN, Steinbach M, Karpatne A, Kumar V. Introduction to data mining, Pearson. 2nd ed; 2019.

    Google Scholar 

  11. Therneau TM, Atkinson EJ, Foundation M. An introduction to recursive partitioning using the RPART routines. 2019. Available from:

    Google Scholar 

  12. Cafri G, Li L, Paxton EW, Fan J. Predicting risk for adverse health events using random forest. J Appl Stat. 2018;45(12):2279–94.

    Article  Google Scholar 

  13. Kuhn M, Wing J, Weston S, Williams A, Keefer C, Engelhardt A, et al. Classification and regression training. The caret package, 2020. Available from:

    Google Scholar 

  14. Graham W. Data mining with rattle and R: the art of excavating data for knowledge discovery. New York: Springer; 2011.

    Google Scholar 

  15. Maia C, Brandão R, Roncalli A, Maranhão H. Length of stay in a neonatal intensive care unit and its association with low rates of exclusive breastfeeding in very low birth weight infants. J Maternal Fetal Neonatal Med. 2011;24(6):774–7.

    Article  Google Scholar 

  16. Kirchner L, Jeitler V, Waldhör T, Pollak A, Wald M. Long hospitalization is the most important risk factor for early weaning from breast milk in premature babies. Acta Paediatr. 2009;98(6):981–4.

    PubMed  Article  Google Scholar 

  17. World Health Organization. Hospital care for mothers and newborn babies: quality assessment and improvement tool. 2nd ed. Washington, D.C.: World Health Organization (WHO); 2014.

    Google Scholar 

  18. Blencowe H, Cousens S, Chou D, Oestergaard M, Say L, Moller A-B, et al. Born too soon: the global epidemiology of 15 million preterm births. Reprod Health. 2013;10(Suppl 1):S2.

    PubMed  PubMed Central  Article  Google Scholar 

  19. Dall’Oglio I, Salvatori G, Bonci E, Nantini B, D’Agostino G, Dotta A. Breastfeeding promotion in neonatal intensive care unit: impact of a new program toward a BFHI for high-risk infants. Acta Paediatr. 2007;96(11):1626–31.

    PubMed  Article  Google Scholar 

  20. Bicalho-Mancini PG, Velásquez-Meléndez G. Exclusive breastfeeding at the point of discharge of high-risk newborns at a neonatal intensive care unit and the factors associated with this practice. J Pediatr. 2004;80(3):241–8.

    Article  Google Scholar 

  21. Schanler RJ, Lau C, Hurst NM, Smith EO. Randomized trial of donor human milk versus preterm formula as substitutes for mothers’ own milk in the feeding of extremely premature infants. Pediatrics. 2005;116(2):400–6.

    PubMed  Article  Google Scholar 

  22. Mahon J, Claxton L, Wood H. Modelling the cost-effectiveness of human milk and breastfeeding in preterm infants in the United Kingdom. Health Econ Rev. 2016;6(1):54.

    PubMed  PubMed Central  Article  Google Scholar 

  23. Patel AL, Johnson TJ, Engstrom JL, Fogg LF, Jegier BJ, Bigger HR, et al. Impact of early human milk on sepsis and health care costs in very low birth weight infants. J Perinatol. 2013;33(7):514–9.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  24. Flint A, New K, Davies MW. Cup feeding versus other forms of supplemental enteral feeding for newborn infants unable to fully breastfeed. Cochrane Database Syst Rev. 2016;31(8):CD005092.

    Google Scholar 

  25. Collins CT, Makrides M, Gillis J, McPhee AJ. Avoidance of bottles during the establishment of breast feeds in preterm infants. Cochrane Database Syst Rev. 2008;8(4):CD005252.

    Google Scholar 

  26. World Health Organization. Guidelines on optimal feeding of low birth weight infants in low- and middle-income countries. Washington: World Health Organization (WHO); 2012.

    Google Scholar 

  27. Martins CD, Furlan RMMM, Motta AR, Viana MCFB. Electromyography of muscles involved in feeding premature infants. Codas. 2015;27(4):372–7.

    PubMed  Article  Google Scholar 

  28. França EC, Sousa CB, Aragão LC, Costa LR. Electromyographic analysis of masseter muscle in newborns during suction in breast, bottle or cup feeding. BMC Pregnancy Childbirth. 2014;14:154.

    PubMed  PubMed Central  Article  Google Scholar 

  29. Rede Brasileira de Banco, Leite de Humano, Fundação Oswaldo Cruz. NT47–18: uso do leite humano cru exclusivo em ambiente neonatal. 2018. Available from:

    Google Scholar 

  30. Arslanoglu S, Moro GE, Bellù R, Turoli D, De Nisi G, Tonetto P, et al. Presence of human milk bank is associated with elevated rate of exclusive breastfeeding in VLBW infants. J Perinat Med. 2013;41(2):129–31.

    PubMed  Article  Google Scholar 

  31. Bode L, Mcguire M, Rodriguez JM, Geddes DT, Hassiotou F, Hartmann PE, McGuire MK. It’s alive: microbes and cells in human milk and their potential benefits to mother and infant. Adv Nutr. 2014;5(5):571–3.

    PubMed  PubMed Central  Article  Google Scholar 

  32. Rendón-Macías ME, Castañeda-Muciño G, Cruz JJ, Mejía-Aranguré JM, Villasís-Keever MA. Breastfeeding among patients with congenital malformations. Arch Med Res. 2002;33(3):269–75.

    PubMed  Article  Google Scholar 

  33. Akyüz-Ünsal Aİ, Key Ö, Güler D, Bekmez S, Sagus M, Akcan AB, et al. Retinopathy of prematurity risk factors: does human milk prevent retinopathy of prematurity? Turk J Pediatr. 2019;61(1):13.

    PubMed  Article  Google Scholar 

  34. Villamor-Martínez E, Pierro M, Cavallaro G, Mosca F, Kramer BW, Villamor E. Donor human milk protects against bronchopulmonary dysplasia: a systematic review and meta-analysis. Nutrients. 2018;10(2):238.

    PubMed Central  Article  Google Scholar 

  35. Cacho NT, Parker LA, Neu J. Necrotizing enterocolitis and human milk feeding: a systematic review. Clin Perinatol. 2017;44(1):49–67.

    PubMed  Article  Google Scholar 

  36. Eidelman AI, Eidelman AI. Routine pacifier use in infants: pros and cons. J Pediatr. 2019;95(2):121–3.

    Article  Google Scholar 

  37. Buccini GDS, Pérez-Escamilla R, Paulino LM, Araújo CL, Venancio SI. Pacifier use and interruption of exclusive breastfeeding: systematic review and meta-analysis. Matern Child Nutr. 2017;13(3):e12384.

    Article  Google Scholar 

  38. Mauch CE, Scott JA, Magarey AM, Daniels LA. Predictors of and reasons for pacifier use in first-time mothers: an observational study. BMC Pediatr. 2012;12:7.

    PubMed  PubMed Central  Article  Google Scholar 

  39. Aarts C, Hörnell A, Kylberg E, Hofvander Y, Gebre-Medhin M. Breastfeeding patterns in relation to thumb sucking and pacifier use. Pediatrics. 1999;104(4):e50.

    CAS  PubMed  Article  Google Scholar 

  40. Maastrup R, Hansen BM, Kronborg H, Bojesen SN, Hallum K, Frandsen A, Kyhnaeb A, Svarer I, Hallstrom I. Factors associated with exclusive breastfeeding of preterm infants: results from a prospective national cohort study. PLoS One. 2014;9(2):e89077.

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  41. Nieuwoudt SJ, Ngandu CB, Manderson L, Norris SA. Exclusive breastfeeding policy, practice and influences in South Africa, 1980 to 2018: a mixed-methods systematic review. PLoS One. 2019;14(10):e0224029.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  42. Mikami FCF, Francisco RPV, Rodrigues A, Hernandez WR, Zugaib M, de Lourdes Brizot M. Breastfeeding twins: factors related to weaning. J Hum Lact. 2018;34(4):749–59.

    PubMed  Google Scholar 

  43. Chantry CJ, Dewey KG, Peerson JM, Wagner EA, Nommsen-Rivers LA. In-hospital formula use increases early breastfeeding cessation among first-time mothers intending to exclusively breastfeed. J Pediatr. 2014;164(6):1339–45 e5.

    PubMed  PubMed Central  Article  Google Scholar 

  44. Moraes BA, Gonçalves ADC, Strada JKR, Gouveia HG. Factors associated with the interruption of exclusive breastfeeding in infants up to 30 days old. Revista Gaúcha de Enfermagem. 2016;37(spe):e2016–0044.

    Article  Google Scholar 

  45. Brasil. Ministério da Saúde. II Pesquisa de prevalência de aleitamento materno nas capitais brasileiras e distrito federal. Secretaria de Atenção à Saúde. Departamento de Ações Programáticas e Estratégicas. Brasília: Ministério da Saúde; 2010. [cited 2020 Jan 10]. Available from:

  46. Santos IS, Barros FC, Horta BL, Menezes AMB, Bassani D, Tovo-Rodrigues L, Lima NP, Victora CG. Breastfeeding exclusivity and duration: trends and inequalities in four population-based birth cohorts in Pelotas, Brazil, 1982–2015. Int J Epidemiol. 2019;48(Supplement_1):i72–9.

    PubMed  PubMed Central  Article  Google Scholar 

  47. Maastrup R, Hansen BM, Kronborg H, Bojesen SN, Hallum K, Frandsen A, Kyhnaeb A, Svarer I, Hallström I. Breastfeeding progression in preterm infants is influenced by factors in infants, mothers and clinical practice: the results of a national cohort study with high breastfeeding initiation rates. PLoS One. 2014;9(9):e108208.

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  48. Walters D, Eberwein JD, Sullivan L, D’Alimonte M, Shekar M. An investment framework for meeting the global nutrition target for breastfeeding. [cited 2020 Jan 10]. Available from:

Download references


We are grateful for our participants’ support. The authors would like to acknowledge developer Vinicius Ramires Leite, who created the web application for the cohort. We wish to thank Dr. João Aprígio Guerra de Almeida, coordinator of the Global Network of Human Milk Banks (FIOCRUZ), Brazil, and Dr. Danielle Aparecida da Silva, coordinator of the National Reference Center of Human Milk Banks (FIOCRUZ), for their support and highly valuable comments. We also acknowledge the colleagues of the Human Milk Bank at IFF/FIOCRUZ for their support, and Marlene Assumpção, Alana Kohn, Antonio Azeredo, Rosânea Santos, Flavia Benedicto, Rafaelle Cristine, Pernelle Pastorelli, Silvia Azevedo, Alexia Martins, Taina Gomes, Caroline Lima, Pamela Mourão, Luiza Reis and Camila Chaves for assisting in data collection.


“This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.” This funding supports the PhD program at ENSP. There was no funding for the study design, data collection, analysis, data interpretation, or writing of the manuscript.

Author information

Authors and Affiliations



MDBS, RVCO and ECPM conceptualized the study and designed data collection methods. MDBS collected data. MDBS, RVCO, DSBA and ECPM were the responsible analysts and they commented on the results. MDBS and DSBA were responsible for writing the initial draft of the manuscript, and subsequent drafts were reviewed by all authors listed. All authors had input on interpretation and reporting of study findings. All authors provided approval for the published version of this manuscript.

Corresponding author

Correspondence to Maíra Domingues Bernardes Silva.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Ethics Committees at IFF/FIOCRUZ, Brazil (Protocol Number: 1.930.996–2017). All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee. Written informed consent was obtained from all mothers over 18 years old included in the study. For participants under the age of 18, a parent or guardian provided written informed consent on their behalf.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Comparison between included and excluded participants due to missing data. Rio de Janeiro, Brazil, 2018.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Silva, M.D.B., de Oliveira, R.d.V.C., da Silveira Barroso Alves, D. et al. Predicting risk of early discontinuation of exclusive breastfeeding at a Brazilian referral hospital for high-risk neonates and infants: a decision-tree analysis. Int Breastfeed J 16, 2 (2021).
