Advancements in adaptive educational technologies, specifically adaptive learning systems, have made it possible to automatically optimize the sequencing of pedagogical instructions according to the needs of individual learners. The crux of such systems lies in the instructional sequencing policy, which recommends personalized learning material based on the learner's learning experiences in order to maximize their learning outcomes. However, the limited information available about learners, such as their cognitive and affective states and their competence levels on ongoing knowledge points, poses critical challenges to optimizing individual-specific pedagogical instructions in real time. Moreover, devising such a decision policy for every learner with a unique knowledge profile demands a trade-off between the learner's current knowledge and their curiosity to learn the next knowledge point. To address these challenges, this paper proposes a personalized adaptability knowledge extraction strategy (PAKES) using cognitive diagnosis and reinforcement learning (RL). We apply the general diagnostic model to track the current knowledge state of the learners. Subsequently, an RL-based Q-learning algorithm is employed to recommend optimal pedagogical instructions for individuals to meet their learning objectives while maintaining an equilibrium between learner control and teaching trajectories. The results indicate that the learning analytics of the proposed framework can fairly deliver the optimal pedagogical paths for learners based upon their learning profiles. A 62% learning progress score was achieved with the pedagogical paths recommended by PAKES, showing a 20% improvement compared to the baseline model.
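To make the Q-learning step more concrete, the following is a minimal sketch of tabular Q-learning for instructional sequencing. It is not the PAKES implementation. Everything in it is an assumption made for illustration: the three knowledge points, the prerequisite chain, the bitmask encoding of the mastery profile, the reward values, the hyperparameters, and the function names `step`, `train`, and `recommend_path`. In particular, the learner's knowledge state is taken as directly observable here, whereas the paper estimates it with the general diagnostic model.

```python
import random

N_POINTS = 3                  # hypothetical number of knowledge points
N_STATES = 1 << N_POINTS      # mastery profile encoded as a bitmask
GOAL = N_STATES - 1           # every knowledge point mastered


def step(state, action):
    """Toy learner model (assumed): point k is learned only once point k-1 is mastered."""
    already = state & (1 << action)
    prereq_ok = action == 0 or state & (1 << (action - 1))
    if not already and prereq_ok:
        return state | (1 << action), 1.0   # learning progress is rewarded
    return state, -0.1                      # wasted instruction carries a small cost


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * N_POINTS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0                            # learner starts with nothing mastered
        for _ in range(20):                  # cap on instructions per episode
            if rng.random() < epsilon:
                action = rng.randrange(N_POINTS)
            else:
                action = max(range(N_POINTS), key=lambda a: q[state][a])
            nxt, reward = step(state, action)
            # The goal state is terminal, so no future value is bootstrapped from it.
            future = 0.0 if nxt == GOAL else gamma * max(q[nxt])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = nxt
            if state == GOAL:
                break
    return q


def recommend_path(q, state=0):
    """Follow the greedy policy to list the knowledge points to teach next."""
    path = []
    while state != GOAL and len(path) < N_POINTS:
        action = max(range(N_POINTS), key=lambda a: q[state][a])
        path.append(action)
        state, _ = step(state, action)
    return path


if __name__ == "__main__":
    print(recommend_path(train()))
```

On this toy model the learned greedy policy teaches the points in prerequisite order. A realistic setting would replace `step` with transitions driven by diagnosed learner responses and would need a reward that reflects both learner control and the teaching trajectory.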