Turkish Natural Language Processing Studies

Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-1 , December 2019, URL: https://www.ijtsrd.com/papers/ijtsrd29831.pdf Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/29831/turkish-natural-language-processing-studies/yilmaz-ince-e

International Journal of Trend in Scientific Research and Development (IJTSRD)

Volume 4 Issue 1, December 2019 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470

Turkish Natural Language Processing Studies


Yılmaz İnce, E
Department of Computer Technologies, Isparta University of Applied Sciences, Isparta, Turkey

ABSTRACT
Natural language processing is an engineering field concerned with the design and implementation of computer systems whose main function is to analyze, understand, interpret and produce a natural language. A literature review of Turkish natural language processing studies was conducted using the documentation method. To keep the review reliable, scientific research and thesis studies on natural language processing in Turkey were examined by scanning the Higher Education thesis pages. The studies obtained from the literature review were evaluated by subject and are presented as morphological analysis studies, syntactic analysis studies, semantic analysis studies and problem analysis studies.

KEYWORDS: natural language processing; morphological analysis; syntactic analysis; semantic analysis; problem analysis

How to cite this paper: Yilmaz Ince, E "Turkish Natural Language Processing Studies" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-1, December 2019, pp.1122-1128, URL: www.ijtsrd.com/papers/ijtsrd29831.pdf

Copyright © 2019 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)

1. INTRODUCTION
Natural language processing is defined as an engineering field concerned with the design and implementation of computer systems whose main function is to analyze, understand, interpret and produce a natural language. Applications are developed by using natural language processing methods such as morphological, syntactic and semantic analysis; rules are created for problem situations, and solutions are produced by analyzing the problems (Nabiyev, 2012). Morphological, syntactic and semantic analysis are used in checking and correcting spelling errors, machine translation, information extraction, information retrieval, the development of question answering systems, text summarization and automatic exam scoring.

Morphological analysis examines the suffixes, the types of suffixes and the origins of words, which are the structural features of a language. Syntactic analysis explores the hierarchical arrangement of the word sequence and of the elements that make up the sentence. Semantic analysis enables discrete words to be matched to appropriate objects in a database by using a corpus. In addition, semantic analysis is concerned with developing appropriate models for combining discrete words.

2. METHOD
The literature review of Turkish natural language processing studies was conducted using the documentation method (Yıldırım & Şimşek, 2000). To keep the review reliable, scientific research and thesis studies on natural language processing in Turkey were examined by scanning the Higher Education thesis pages. The studies obtained from the literature review were evaluated by subject under the following headings:
• Morphological analysis studies
• Syntactic analysis studies
• Semantic analysis studies
• Problem analysis studies.

3. TURKISH NATURAL LANGUAGE PROCESSING STUDIES
3.1. Morphological analysis studies
Güngör (1995) carried out a morphological analysis of Turkish. In this study, affix structure and order were examined, and the morphological structure of Turkish was modeled on the computer in the augmented (extended) transition network formalism. In addition, the software developed in the study can perform spell checking and correction.

Dependency parsing is the method that enables a sentence to be analyzed by detecting the binary relations between the words within it. Eryigit (2006) examined and modeled the dependency parsing of Turkish. In this study, different types of parsers were developed, the performance of the parsers and models was compared, and the classifier-based discriminative parser was shown to give the best results.

Alkım (2006) designed shell software for morphological analysis and dictionary design in natural language processing for Turkish, supported by interfaces that allow the user to change and improve the rules of the dictionary and

@ IJTSRD | Unique Paper ID – IJTSRD29831 | Volume – 4 | Issue – 1 | November-December 2019 Page 1122
morphology. This software can be used for different languages.

Karadeniz (2007) aimed to determine the correct one among the results produced by a morphological analyzer for Turkish. In this study, the ambiguity distributions of Turkish words were extracted, and the words were then clustered according to their ambiguity properties. The ambiguities were eliminated by writing rules for each type of ambiguity. The study was tested and its performance was found to be 82.6%.

Shylov (2008) developed a bidirectional morphological machine translation application for Turkmen and Turkish. In the research, a morphological analyzer and a morphological generator were implemented for Turkmen and Turkish, and root dictionaries were created for both languages.

Kışla (2009) presented original solutions to morphological analysis and part-of-speech detection, two of the main problems in the field of natural language processing. For morphological analysis, which is theoretically known to be NP-complete, the complexity for agglutinative languages was addressed with a simplified method that takes the grammatical features of Turkish into account. The method, which uses statistical and rule-based approaches together, provides a single result at the end of the analysis and thus eliminates ambiguity. In addition, the proposed method uses a closed and limited dictionary, an important feature that distinguishes it from other methods.

Yilmaz (2009) developed a morphological parser whose features include adding the simplified forms of words to the results, full morphological analysis of verbs, analysis of words expressing numerical values, dates and times, analysis of abbreviations and proper names, production and affixes that can be modified, and tagging, aimed especially at Turkish users. In addition, further features were provided by a text tokenizer that brings the text to be analyzed into the analysis format, a module for comparison with other analyzers, and word derivation modules that produce words from affix sequences.

Savaşçı (2010) determined the degree of completeness of Turkish and English from statistical data: the degree of completeness was calculated as 0.56 for Turkish and 0.12 for English. The study therefore concludes that natural language processing is more difficult in Turkish than in English.

Aktaş (2010) notes that a corpus that can represent the features of a language is required in order to determine the morphological features of languages. In this study, natural language processing methods for Turkish were developed by using a rule-based approach, and an infrastructure called Rule-Based Automatic Corpus Generation was created to implement the methods.

Yalçınkaya (2018) addresses identifying and eliminating, at an early stage, the errors that may occur in the software development life cycle, by conducting review activities within the requirements management process. The automatic review of Turkish software requirements and the successful elimination of errors are directly proportional to the scope, accuracy and success of the language processing algorithms. For this purpose, a new Formal Automatic Requirement Review tool was developed for Turkish software projects.

3.2. Syntactic analysis studies
Hallaç (2007) developed TurPOS, a rule-based part-of-speech tagger for Turkish that aims to assign the most appropriate word class to each word in a given Turkish document. TurPOS uses a corpus produced by a morphological analysis application as its input document. The system also uses a rule file containing Turkish grammar rules.

Agun (2008) presented a graph-based automatic learning model for learning syntactic features in Turkish. In this study, the graph model was trained on a corpus, and the correct syntactic tags for a given sentence were extracted through this model. Probability-based graph models, Hidden Markov Models and graph theory were used in the design of the model. Unlike other probability-based tagging algorithms and statistical natural language processing studies, the study developed a probability-based graph model in which Turkish morphological features can be used.

Eker (2009) studied syntactic parsing, one of the main problems of natural language processing. The two most common types of parsing are phrase structure parsing and dependency parsing. In the study, parsers were evaluated on textual requirements, and the phrase structure parser got the highest score.

Şentürk (2009) worked on a rule-based find-and-replace function for the Turkish language. The main motivation for the study is that word processing programs are inadequate for Turkish. In this research, finite state machines were created and correct analyses were produced for all root and affix types. The suffixes of the words found are analyzed correctly, and the correct replacements are made using these resolved suffixes.

Kutlu (2010) developed a noun phrase extraction system for Turkish texts. In the study, a weighted constraint dependency parser was used to show the relationships between sentence components and to find noun phrases.

Tahiroğlu (2010) created a web-based compilation dictionary to be used in computer-assisted dictionary studies. MySQL is used as the database of the compilation dictionary, and the web interface is coded in PHP. The developed software is expected to provide data for future lexicology research and dictionary writing, and to support the preparation of online, shared dictionaries.

A rule-based system was developed to identify phrases in Turkish, and software that uses the system was prepared. The developed software makes it possible to define Turkish word structures for the determination of word types (Turna, 2011).

Zafer (2011) developed a general sentence analyzer for Turkic languages. The analyzer is based on context-free grammar rules, morphological analysis and validity rules. In this study, the grammar of Turkish and
Turkmen was examined from a computational point of view, and the system was implemented for Turkish. Later, it was adapted to Turkmen with partial changes.

Kazkılınç (2012) tagged news texts with the phrases that specify the subject, predicate, place and time of the text. For this purpose, the most dominant subject, predicate, place and time were selected from the sentences in the text. With the tag information obtained in the study, the subject of the text is represented; it can be used as a tag in the semantic network and to reach the desired data in search engines.

3.3. Semantic analysis studies
Çakıroğlu (2001) developed a model for understanding and solving simple arithmetic problems by computer. For the semantic analysis, the knowledge base was arranged as a semantic network. In solving the problems, the meanings of the sentences were determined through semantic networks and the necessary calculations were made in accordance with these meanings. Software was developed to perform semantic analysis in the computer processing of Turkish as a natural language, and the performance level of this software was determined.

Orhan (2006) discussed the most appropriate algorithms and features for disambiguating the meaning of words in Turkish text. Words and sense classes that can be used in word sense disambiguation studies were created for Turkish, the texts to be used by the algorithms were annotated by hand, and a conceptual dictionary was prepared, making a significant contribution to research in this field.

Bahadır (2007) worked on the analysis and symbolization of arithmetic problems at primary school level. The modules of the developed application are: morphological analysis of words, determination of the suffixes of the words, determination of the question pattern, solving the problem, expressing the solution in the form of equations, and symbolizing the problem and the solution.

Birant (2008) built a standardized structure on the end-of-sentence detection algorithm developed by Aktaş (2006), which has an accuracy rate of 99%, together with some changes made in this study. In this research, an XML structure is used for the files presented to the user as the output of the targeted morphological analysis program and for the rules applied between affixes during morphological parsing. The names of the affix types to be included in the XML structures were systematically shortened within the framework of a set of rules, allowing roots and affixes to be labeled.

Özdemir (2009) worked on word sense disambiguation by selecting appropriate features and algorithms for ambiguous words in Turkish texts. Because of the lack of annotated texts on which such studies could be conducted, data consisting of sentences containing the selected sample words were collected first. Afterwards, the features that distinguish the meanings of the selected words were determined, supervised learning algorithms were applied to the data, and the results were assessed with evaluation methods. NaiveBayes, Kstar, SimpleCart and Bagging algorithms were used for sense disambiguation in the tests. In addition, features that could be effective were determined for the words, and the study demonstrated how effective the selected algorithms and features are.

Adalı (2009) worked on the automatic processing of Turkish documents and on extracting information from these documents. While traditional information extraction systems treat the input text as a sequence of words and generally focus on semantic features, the proposed architecture also makes use of the clues provided by the document structure. In addition, to ensure document consistency, the relationships between the entities in the document are tested and the extracted entities are compared with the data actually used. The proposed integrated architecture includes a morphological analysis module for Turkish, a document structure analysis module, a domain ontology and an inference ontology.

Kalender (2010) proposes a system that automatically generates semantic tags for documents. In this study, UNIpedia is intended to provide a knowledge base containing contemporary references (words currently in use on the web). UNIpedia associates various ontological knowledge bases with WordNet concepts. The Wikipedia and OpenCyc knowledge bases, which contain up-to-date and reliable information, are mapped to WordNet concepts. Rule-based heuristics that use the ontological and statistical properties of concepts were used to relate the knowledge bases.

Per (2011) put forward a concept extraction system for Turkish. Because Turkish characters are not compatible with the computer language and because of the complex affix structure of Turkish, a preprocessing step is required first. After preprocessing, only the nouns separated from their affixes were used. Although the system produced more concepts than it needed to, it found the concepts of documents with 51 percent success. Considering that the concepts do not appear exactly in the documents in terms of structure, and given the complex structure of Turkish, this result can be considered quite successful.

There are numerous keyword extraction and text summarization algorithms in the field of natural language processing, some of which are discussed in the study (Güvenç, 2016). To understand the methods, the study started with research on automatic text summarization. The results show that the summarization algorithms give the best results on news texts and less optimal results on short stories.

Kazmı (2017) proposed a new methodology based on answer set programming, focusing on the hard-coded rules in the Inspire system and on predicting Interpretable Semantic Similarity with a rule-based approach. The research showed that the improvements obtained results similar to those of state-of-the-art systems in learning large data sets.

Aktaş (2017) created a Wordnet ontology that contains network terms separated from their affixes, and the connection between two terms in the ontology was calculated. The most effective way to name terms is natural language processing. The aim is to search for computer network terms in an ontological dictionary and to add the words that
are not in the dictionary to the ontology automatically. Terms that have near-meaning and upper-lower (hierarchical) relations are connected to each other in a graph structure. All terms were linked to each other and the dictionary ontology was created. First, a concept map was used to establish the relations, and then the process was automated with a natural language processing algorithm. For a word that is not in the dictionary, the algorithm looks at the ontology nodes of the 10 dictionary words to the right and left of that word in the paragraph and associates the word, through a near-meaning relation, with the closest upper node.

Ergüven (2018) focuses on word representations. Earlier word representation methods rely on counting co-occurrence statistics between a word and the words that accompany it, while current methods are learning-based. The thesis investigated the relationship between these two approaches. Both approaches use context as a normalization factor. The representation of a word with multiple meanings contains more than one meaning. To overcome this problem, a method was developed that provides a separate representation for each meaning of such a word.

3.4. Problem analysis studies
Amasyalı (2003) developed a natural language question answering system for Turkish. The system first translates the question that the user asks in natural language into a search engine query and selects possible answer sentences from the search engine's results page or from the pages it links to. The five sentences with the highest scores according to various criteria are presented to the user. The system was evaluated with 524 questions and was able to answer approximately 43% of the questions when the search engine's results page was used, and 60% when the pages linked from the results page were used.

In the study, the expensive and slow method of manual transcription is combined with fast but error-prone computerized transcription methods (Akman, 2004). Patterns containing the hypotheses about the speech to be transcribed and the output of a speech recognition engine were transformed into letter-based, deterministic, weighted finite-state acceptors, trained on a text collection that corresponded to the speech data in terms of content, and combined with a letter-based statistical language model.

Eş (2005) studied filtering that automatically sorts e-mails into legitimate or unwanted categories. The research sought effective results in e-mail filtering by using machine learning methods together with some ideas from information retrieval.

Yıldırım (2005) worked on making SMS messages meaningful by using statistical natural language processing methods. In this study, a solution was presented based on the N-gram method, one of the most important statistical natural language processing methods.

Tülek (2007) conducted a text summarization study for Turkish. Taking the structure of Turkish into account, different statistical methods for summarizing a text were introduced and implemented in software, and their suitability for Turkish was discussed. As is necessary in all other Turkish information retrieval systems, the effect of different stemming algorithms on summarization performance was examined in view of the agglutinative structure of Turkish. To achieve higher performance, a morphological parser that uses the possible stem and affix combinations of words was used in the implemented stemming algorithms. These categorized words are examined through different summarization methods, and the sentences to be included in the summary are determined for each method. The results produced by these methods were then combined to form the final summary.

Delibaş (2008) worked on checking Turkish spelling errors with natural language processing: finding the roots of the words in the sentences of the entered text, parsing the suffixes, testing the correctness of each word, suggesting words for misspelled ones, and adding non-Turkish foreign words to the dictionary.

Yılmaz (2008) developed, for high school students and university candidates, a natural language interface prototype for a semantic network that processes information about Turkish universities. When questions are put to the developed software in sentence form, the answers are displayed on the screen; the program was prepared to assist students during the university selection period.

Aksoy (2008) created a three-dimensional scene by using natural language processing and a genetic algorithm. In this study, application software was developed that takes as input objects and 18 relations between objects, given within the framework of meaning defined in natural language, produces an 18-dimensional scene interpretation with a genetic algorithm, and uses the placement rules proposed in the study for fitness evaluation.

Kopru (2008) presented a unique approach that integrates automatic speech recognition and automatic translation systems for speech translation. The presented method is unique in that it includes the first rule-based automatic translation system capable of processing audio data in word

Web search engines are used to find documents containing the information sought. However, in many cases the user needs more specific information than a set of documents. Question answering systems address this problem by returning explicit answers rather than a set of documents in response to a question. Er (2009) developed software with a pattern matching approach for answering Turkish questions that have a single answer.

Görmez (2009) implemented Turkish text-to-speech software with machine learning algorithms. The system has two different audio databases, consisting of units recorded directly and units cut from continuous speech, and the software allows the units to be cut from the speech manually or automatically. Signal characteristics such as the number of zero crossings and the energy of the sound are used for automatic cutting.

Soysal (2010) obtained structured information from Turkish radiology reports using natural language processing and domain ontology methods. The domain ontology was used during the information extraction, entity identification and relationship
extraction stages. Since Turkish is a morphologically rich language, a morphological analyzer was used in the natural language processing steps of the study, and these morphological features were utilized in the inference rules. Radiology departments use techniques that visualize patients' bodies; these images are examined by doctors and written up as plain text reports. For medical information systems, it is important to extract information from these plain text reports.

Hadımlı (2011) proposes two methods that can be used in the processing of Turkish radiology reports, one rule-based and one database-based. Unlike previous medical natural language processing studies, neither of these methods uses a medical dictionary or a medical ontology. Knowledge extraction is done at the level of determining the medically related phrases and their relationships in the given sentence. The aim is to determine the baseline performance that Turkish features can offer for medical information extraction and retrieval in the absence of other factors.

Albayrak (2011) investigated the relationship between the use of Turkish and psychological status. In this study, Turkish writings were collected from depressed, depressive, anxious and non-anxious individuals. The writings were analyzed according to features such as the words, modals, personal pronouns, verbs and nouns used most by each diagnostic group. A program was developed to test the differences in vocabulary use between individuals. The results of the test show that word use in Turkish gives many clues about psychological state.

Çelikkaya (2015) aims to demonstrate the effect of advanced natural language processing methods on virtual assistant applications and to develop a high-performance personal assistant application that works with Turkish natural language input on smartphones and tablets. A virtual assistant is an application that contains almost every component of artificial intelligence. The hybrid model developed was 98.30% successful on the test data. The parameters that are meaningful for the specified service are then extracted, and the information for the required services is retrieved from third-party sources. The success of the whole system was determined as 70.79% on average.

Gökdeniz (2016) aims to extract the relationships between brain regions from published articles using natural language processing techniques. With a linguistic approach, the sentences containing relationships were selected on the basis of patterns, and then the related brain regions and their relations with each other were extracted by applying a dependency parser and a constituency parser to these sentences. By determining the direction of these relationships, a connection graph showing how brain regions are connected to each other is obtained. After the developed system is evaluated on the corpus of the Whitetext project, the same methods are used to draw and analyze the connection graph of the PVT brain region. PVT is an important brain region that is believed to affect a wide range of functions such as arousal, stimulation, drug seeking behavior and attention. As the results of the study show, PVT may be a new research focus for the evaluation of brain region behavior.

Yılmaz İnce (2016) developed software for evaluating written exams on the web using natural language processing methods. In the software, Turkish WordNet is used for semantic relations and Zemberek is used as the natural language processing infrastructure. The developed software uses a hybrid model that includes cosine similarity and integrated latent semantic analysis methods. To measure the accuracy of the software, a case study was carried out in the Distance Education Program of the Computer Engineering Department of the Faculty of Engineering at Süleyman Demirel University. According to the results of the study, the software can be used for automatic scoring of Turkish exams with a success rate of 92%.

Emotion (sentiment) analysis aims to classify the emotions and thoughts expressed about a subject in texts as positive or negative (Yelmen, 2016). Attribute selection is often used today to improve classification performance. Different methods are used in this selection, and the aim is to select the most important attributes by discarding irrelevant attributes that affect the success of the classification in the data set. In this way, the success rate can be increased. The thesis focuses on attribute selection from Turkish texts written in everyday spoken language and uses support vector machines, artificial neural networks and centroid-based classification algorithms on thoroughly preprocessed data. Gini Index, Information Gain and Genetic Algorithm were used as hybrids with the 3 different classification algorithms on tweets belonging to the followers of 3 different GSM operators. 100% success was achieved for the 3 GSM operators, in particular when support vector machines were used as a hybrid with the genetic algorithm, which has an important place in dimensionality reduction and works heuristically.

Kaya (2018) studied author recognition. 120 Turkish books by 20 Turkish authors were used. Character n-grams were used as the stylometric feature of the authors, and classification was performed with the Naive Bayes classifier. First, the 120 Turkish books were found and converted to txt format. The bi-gram, tri-gram and quadri-gram features of the authors were computed by calculating frequencies from the training books, and the 200 most frequent ones formed each author's stylometric vector space. At this point the system is ready for author recognition. The author recognition performance of the n-gram vector spaces was then measured. According to the observations, the bi-gram vector space failed, while tri-grams and quadri-grams gave good results. The best performance, 82%, was obtained with quadri-grams. All results and the confusion matrix are given at the end of the thesis.

Keklik (2018) proposes a new rule-based approach for automatic question generation. The proposed approach focuses on the analysis of both the syntactic and the semantic structure of a sentence. Although the main purpose of the designed system was to produce questions from sentences, the results of the automatic evaluation showed that the system performed well on paragraphs that required comprehension skills. In the human evaluations, the designed system outperformed all other systems and produced the most natural (human-like) questions.

Ayata (2018) investigated the efficiency of using machine learning and statistical natural language processing techniques in solving the emotion analysis problem for Twitter data. The thesis covers Turkish tweet emotion analysis, sector based emotion analysis, English tweet emotion
analysis and political orientation prediction. A framework for sector-specific Turkish tweet sentiment analysis that combines machine learning and statistical natural language processing techniques was proposed and applied to the finance, retail, telecommunications and sports sectors. In the political orientation analysis, voters' tendencies were classified as 'democratic' or 'republican' according to their Twitter messages.

Bozyiğit (2019) presents an automatic concept identification model that converts Turkish requirements into a Unified Modeling Language (UML) class diagram, in order to facilitate the work of the software team and reduce the cost of software projects. Natural language processing techniques were used in the study, and a new set of twenty-six rules was created to find object-oriented design elements in requirements. Since no such data set was available to researchers in online repositories, a well-defined data set containing twenty software requirements in Turkish was created and made publicly available on GitHub for use by other researchers.

Ay (2019) wrote a program in Python 3.7 that models production processes using natural language processing. The program first preprocesses the input text with text processing techniques. It then determines whether the production type is cell type, U type, workshop type or mass production type, and models the process accordingly by placing the existing machines. The modeled process can then be saved or printed, or a simulation can be started from the modeled process. The aim of the study is to provide an initial basis for a program that will deliver all the results from spoken narration alone, without the need for any other program for modeling and realizing production processes.

4. CONCLUSION
Natural language processing is an engineering field concerned with the design and implementation of computer systems whose main function is to analyze, understand, interpret and produce a natural language. In this study, thesis studies on natural language processing in Turkey were examined through document review, a valid and reliable scientific research method, by scanning the Higher Education thesis pages. The studies obtained from the literature review were evaluated in terms of their subjects, and the categories of morphological analysis, syntactic analysis, semantic analysis and problem analysis studies were determined and presented.

REFERENCES
[1] Adalı, Ş., 2009. An Integrated Architecture for Information Extraction from Documents in Turkish. Istanbul Technical University, Institute of Science and Technology, Doctorate Thesis, 109, İstanbul.
[2] Agun, H.V., 2008. Doğal Dil İşlemede Çizgisel ve Olasılık Tabanlı bir Otomatik Öğrenme Uygulaması. Trakya Üniversitesi, Fen Bilimleri Enstitüsü, Master Tezi, 59, Edirne.
[3] Akman, Ç.K., 2004. Computer-Aided Transcription Tool. Boğaziçi University, Graduate School of Natural and Applied Sciences, Master Thesis, 75, İstanbul.
[4] Aksoy, E., 2008. HPSG Teorisinin ve Semantik Frame Teorisinin Bir Uygulaması Olarak, Sahne Betimleyen Doğal Dil Cümlelerinden Görsel Yapılar İnşa Edilmesi. Trakya Üniversitesi, Fen Bilimleri Enstitüsü, Yüksek Lisans Tezi, 91, Edirne.
[5] Aktaş, Ö., 2006. Türkçe için Verimli bir Cümle Sonu Belirleme Yöntemi. Akademik Bilişim 2006 - Bilgi Teknolojileri Kongresi IV, Pamukkale University, Denizli, Türkiye.
[6] Aktaş, Ö., 2010. Rule-based natural language processing methods for Turkish. Dokuz Eylül Üniversitesi, Fen Bilimleri Enstitüsü, 211, İzmir.
[7] Aktaş, Y., 2017. Natural language processing based computer network terms using wordnet ontology creation. Süleyman Demirel Üniversitesi, Fen Bilimleri Enstitüsü, 73, Isparta.
[8] Albayrak, N.B., 2011. Doğal Dil İşleme Teknikleri Kullanarak Görüş ve Duygu Analizi. Fatih University, Institute of Sciences and Engineering, Master Thesis, 69, İstanbul.
[9] Alkım, E., 2006. Türkçe için Doğal Dil İşlemede Biçimbirimsel Çözümleme ve Sözlük Tasarımı için Yeni Bir Yöntem. Dokuz Eylül University, Graduate School of Natural and Applied Sciences, Master Thesis, 81, İzmir.
[10] Amasyalı, M.F., 2003. A Natural language processing application running on internet: A question answering system. Yıldız Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 66, İstanbul.
[11] Ay, H., 2019. Manufacturing process modeling with natural language processing. Eskişehir Teknik Üniversitesi, Lisansüstü Eğitim Enstitüsü, 62, Eskişehir.
[12] Ayata, D., 2018. Applying machine learning and natural language processing techniques to twitter sentiment classification for turkish and english. Boğaziçi Üniversitesi, Fen Bilimleri Enstitüsü, 70, İstanbul.
[13] Bahadır, Ö., 2007. Aritmetik Problemlerin Çözümlenmesi ve Simgelenmesi. Istanbul Technical University, Institute of Science and Technology, Master Thesis, 66, İstanbul.
[14] Birant, Ç.C., 2008. Root-Suffix Seperation of Turkish Words. Dokuz Eylül University, Graduate School of Natural and Applied Sciences, Master Thesis, 56, İzmir.
[15] Bozyiğit, F., 2019. Object oriented analysis and source code validation using natural language processing. Dokuz Eylül Üniversitesi, Fen Bilimleri Enstitüsü, 107, İzmir.
[16] Çakıroğlu, Ü., 2001. Knowledge modelling by semantic networks in natural language processing. Karadeniz Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 106.
[17] Çelikkaya, G., 2015. Development of a turkish mobile assistant software using natural language processing techniques. İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 93, İstanbul.
[18] Delibaş, A., 2008. Doğal Dil İşleme ile Türkçe Yazım Hatalarının Denetlenmesi. İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, Yüksek Lisans Tezi, 78, İstanbul.
[19] Eker, Ö., 2009. Parser Evaluation Using Textual Entailments. Boğaziçi University, Master Thesis, 107, İstanbul.
[20] Er, N.P., 2009. Turkish Factoid Question Answering Using Answer Pattern Matching. Bilkent University, The Institute of Engineering and Sciences, Master Thesis, 155, Ankara.
[21] Eş, S., 2005. A Computational Analysis of a Language Structure in Natural Language Text Processing. Çankaya
University, The Graduate School of Natural and Applied Sciences, Master Thesis, 52, Ankara.
[22] Gökdeniz, E., 2016. Natural language processing for mining neuroanatomical relations among brain regions. Boğaziçi Üniversitesi, Fen Bilimleri Enstitüsü, 81, İstanbul.
[23] Görmez, Z., 2009. Implementation of a Text-to-Speech System with Machine Learning Algorithms in Turkish. Fatih University, Institute of Sciences and Engineering, Master Thesis, 63, İstanbul.
[24] Güngör, T., 1995. Computer Processing of Turkish: Morphological and Lexical Investigation. Boğaziçi University, Doctorate Thesis, 185, İstanbul.
[25] Güvenç, B., 2016. Machine learning methods in natural language processing. Boğaziçi Üniversitesi, Fen Bilimleri Enstitüsü, 137, İstanbul.
[26] Hadımlı, K., 2011. Processing Turkish Radiology Reports. Middle East Technical University, The Graduate School of Natural and Applied Sciences, Master Thesis, 86, Ankara.
[27] Hallaç, Ü., 2007. Determination of Turkish Word Types. Dokuz Eylül University, Graduate School of Natural and Applied Sciences, Master Thesis, 51, İzmir.
[28] Kalender, M., 2010. Automated Semantic Tagging of Text Documents. Boğaziçi University, Master Thesis, 107, İstanbul.
[29] Karadeniz, Z.İ., 2007. Türkçe için Biçimbirimsel Belirsizlik Giderici. İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, Yüksek Lisans Tezi, 67, İstanbul.
[30] Kaya, S., 2018. Author-book recognition with natural language processing techniques. İstanbul Aydın Üniversitesi, Fen Bilimleri Enstitüsü, 89, İstanbul.
[31] Kazkılınç, S., 2012. Türkçe Metinlerin Etiketlenmesi. İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, Yüksek Lisans Tezi, 63, İstanbul.
[32] Kazmı, M., 2017. Learning logic rules from text using statistical methods for natural language processing. Sabancı Üniversitesi, Mühendislik ve Fen Bilimleri Enstitüsü, 97.
[33] Keklik, O., 2018. Automatic question generation using natural language processing techniques. İzmir Yüksek Teknoloji Enstitüsü, Mühendislik ve Fen Bilimleri Enstitüsü, 52, İzmir.
[34] Kışla, T., 2009. Türkçe için Tümleşik bir Biçim Çözümleme ve Sözcük Türü Tespit Yöntemi. Ege Üniversitesi, Fen Bilimleri Enstitüsü, Doktora Tezi, 185, İzmir.
[35] Köprü, S., 2008. Coupling Speech Recognition and Rule-Based Machine Translation. The Middle East Technical University, The Institute of Informatics, Doctorate Thesis, 130, Ankara.
[36] Kutlu, M., 2010. Noun Phrase Chunker for Turkish Using Dependency Parser. Bilkent University, The Institute of Engineering and Sciences, Master Thesis, 124, Ankara.
[37] Orhan, Z., 2006. Türkçe Metinlerdeki Anlam Belirsizliği Olan Sözcüklerin Bilgisayar Algoritmaları ile Anlam Belirginleştirmesi. İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, Doktora Tezi, 124, İstanbul.
[38] Özdemir, V., 2009. Word Sense Disambiguation for Turkish Lexical Sample. Fatih University, Institute of Sciences and Engineering, Master Thesis, 63, İstanbul.
[39] Per, M. U., 2011. Developing A Concept Extraction System for Turkish. The Middle East Technical University, The Institute of Informatics, Master Thesis, 59, Ankara.
[40] Savaşçı, A., 2010. Türkçenin Bitişkenlik Derecesinin İstatistiksel Verilerle Belirlenmesi. Ege Üniversitesi, Fen Bilimleri Enstitüsü, Yüksek Lisans Tezi, 58, İzmir.
[41] Sevgili Ergüven, Ö., 2018. A systematic evaluation of semantic representations in natural language processing. İzmir Yüksek Teknoloji Enstitüsü, Mühendislik ve Fen Bilimleri Enstitüsü, 83, İzmir.
[42] Shylov, M., 2008. Turkish and Turkmen Morphological Analyzer and Machine Translation Program. Fatih University, Institute of Sciences and Engineering, Master Thesis, 46, İstanbul.
[43] Soysal, E., 2010. Ontology Based Information Extraction on Free Text Radiological Reports Using Natural Language Processing Approach. The Middle East Technical University, The Institute of Informatics, Doctorate Thesis, 110, Ankara.
[44] Şentürk, F., 2009. Biçimbirimsel Bul ve Değiştir. İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, Yüksek Lisans Tezi, 129, İstanbul.
[45] Tahiroğlu, B. T., 2010. Bilgisayar Destekli Sözlük Bilimi Çalışmalarında Derleme Sözlüğü Veri Tabanı Örneği. Çukurova Üniversitesi, Sosyal Bilimler Enstitüsü, Doktora Tezi, 207, Adana.
[46] Turna, S. E., 2011. Sözcük Türlerinin Belirlenmesi için Türkçe Kelime Yapılarının Tanımlanması. Dokuz Eylül University, Graduate School of Natural and Applied Sciences, Master Thesis, 38, İzmir.
[47] Tülek, M., 2007. Türkçe için Metin Özetleme. İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, Yüksek Lisans Tezi, 88, İstanbul.
[48] Yalçınkaya, E., 2018. Natural language processing based formal automated review tool for Turkish software requirements. Türk Hava Kurumu Üniversitesi, Fen Bilimleri Enstitüsü, 117.
[49] Yelmen, İ., 2016. Sentiment analysis with natural language processing methods on Turkish social media data. İstanbul Aydın Üniversitesi, Fen Bilimleri Enstitüsü, 91, İstanbul.
[50] Yıldırım, Ö., 2005. Improved Handling of Sms Messages with statistical Natural Language Processing Techniques. Boğaziçi University, Master Thesis, 66, İstanbul.
[51] Yıldırım, A., Şimşek, H., 2000. Sosyal Bilimlerde Araştırma Yöntemleri. Seçkin Yayıncılık, Ankara, p. 140.
[52] Yılmaz, E. Ç., 2008. A Turkish Natural Language Interface for the Semantic Web: A Case Study on Turkish Universities. Atılım University, Graduate School of Natural and Applied Sciences, Master Thesis, 42, Ankara.
[53] Yılmaz, S., 2009. Türkçe için İyileştiriliş Biçimbilimsel Çözümleyici. İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, Yüksek Lisans Tezi, 55, İstanbul.
[54] Yılmaz İnce, E., 2016. Web Ortamındaki Yazılı Sınavların Doğal Dil İşleme Yöntemleri İle Değerlendirilmesi. Süleyman Demirel Üniversitesi, Fen Bilimleri Enstitüsü, Doktora Tezi, 117, Isparta.
[55] Zafer, H. R., 2011. A Generic Syntactic Parser for Turkic Languages. Fatih University, Institute of Sciences and Engineering, Master Thesis, 59, İstanbul.
