Class PersonPageData


  • public class PersonPageData
    extends java.lang.Object
    Class used to retrieve and represent information about people belonging to the dynasties of Roman emperors.
    Author:
    Radu Ionut Barbalata
    See Also:
    Person, PersonNameUrl, PersonPageDataSerializer
    • Field Detail

      • personDynastyPageUrl

        private java.lang.String personDynastyPageUrl
      • imageUrl

        private java.lang.String imageUrl
      • role

        private java.lang.String role
      • birthDate

        private java.lang.String birthDate
      • deathDate

        private java.lang.String deathDate
      • reignBeginningDate

        private java.lang.String reignBeginningDate
      • reignEndDate

        private java.lang.String reignEndDate
      • adoptiveFatherNameUrl

        private PersonNameUrl adoptiveFatherNameUrl
      • successors

        private java.util.ArrayList<PersonNameUrl> successors
      • adoptedChildren

        private java.util.ArrayList<PersonNameUrl> adoptedChildren
      • urlPersonPageDataMatches

        private static java.util.HashMap<java.lang.String,​PersonPageData> urlPersonPageDataMatches
      • dynastiesPeopleList

        private static java.util.HashMap<java.lang.String,​java.util.HashMap<java.lang.String,​java.lang.String>> dynastiesPeopleList
      • months

        private static java.util.HashSet<java.lang.String> months
    • Constructor Detail

      • PersonPageData

        public PersonPageData​(java.lang.String personPageUrl,
                              org.json.simple.JSONObject serializedPersonPageData)
        Populate the fields of a PersonPageData object from a JSON object
        Parameters:
        personPageUrl - the person's Wikipedia page URL
        serializedPersonPageData - the JSONObject to deserialize data from
      • PersonPageData

        public PersonPageData​(org.openqa.selenium.WebDriver webDriver,
                              PersonNameUrl personNameUrl,
                              java.lang.String dynastyPageUrl)
        Populate the fields of a PersonPageData object with the information obtained by scraping a dynasty member's Wikipedia page
        Parameters:
        webDriver - the Web Driver instance to be used to scrape data
        personNameUrl - the person's PersonNameUrl object
        dynastyPageUrl - the Wikipedia page URL of the dynasty currently being scraped
    • Method Detail

      • getPersonPageData

        public static PersonPageData getPersonPageData​(org.openqa.selenium.WebDriver webDriver,
                                                       PersonNameUrl personNameUrl,
                                                       java.lang.String dynastyPageUrl)
        Construct a PersonPageData object, or return the existing one if it was already constructed
        Parameters:
        webDriver - the Web Driver instance to be used to scrape data
        personNameUrl - the person's PersonNameUrl object
        dynastyPageUrl - the dynasty's Wikipedia page URL
        Returns:
        the constructed PersonPageData object
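The construct-or-return behaviour described for getPersonPageData is a memoization pattern. Below is a minimal sketch of that idea, not the class's actual code. It assumes the cache is keyed on the person's page URL alone (the real lookup may also involve the dynasty URL), and `PageDataCacheSketch`, `PageData` and `constructions` are hypothetical stand-ins; the real method scrapes the page with a WebDriver instead of calling a trivial constructor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for PersonPageData: it only illustrates the caching idea.
final class PageDataCacheSketch {
    // Stand-in for the scraped data; the real class holds many more fields.
    static final class PageData {
        final String personPageUrl;
        final String dynastyPageUrl;

        PageData(String personPageUrl, String dynastyPageUrl) {
            this.personPageUrl = personPageUrl;
            this.dynastyPageUrl = dynastyPageUrl;
        }
    }

    // Assumed to play the role of urlPersonPageDataMatches (page URL -> instance).
    private static final Map<String, PageData> cache = new HashMap<>();

    // Counts how often the (normally expensive) construction actually ran.
    static int constructions = 0;

    static PageData getPageData(String personPageUrl, String dynastyPageUrl) {
        PageData cached = cache.get(personPageUrl);
        if (cached != null) {
            return cached; // already constructed: reuse the same instance
        }
        constructions++;
        PageData created = new PageData(personPageUrl, dynastyPageUrl);
        cache.put(personPageUrl, created);
        return created;
    }

    public static void main(String[] args) {
        PageData first = getPageData("https://example.org/wiki/Person", "https://example.org/wiki/Dynasty");
        PageData second = getPageData("https://example.org/wiki/Person", "https://example.org/wiki/Dynasty");
        System.out.println("same instance: " + (first == second));
        System.out.println("constructions: " + constructions);
    }
}
```

Reusing the cached instance avoids scraping the same Wikipedia page twice when a person is reachable from several relatives.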
      • getCachedPersonPageData

        public static PersonPageData getCachedPersonPageData​(java.lang.String personPageUrl,
                                                             java.lang.String dynastyPageUrl)
        Return an already created PersonPageData object, or null if none has been created
        Parameters:
        personPageUrl - the person's Wikipedia page URL
        dynastyPageUrl - the dynasty's Wikipedia page URL
        Returns:
        the PersonPageData object associated with the given URL, or null
      • addToUrlPersonPageDataMatches

        public static void addToUrlPersonPageDataMatches​(java.lang.String url,
                                                         PersonPageData personPageData)
        Add a PersonPageData instance to the urlPersonPageDataMatches HashMap
        Parameters:
        url - the Wikipedia page URL to be used to later retrieve it
        personPageData - the PersonPageData instance
      • setUrlPersonPageDataMatches

        public static void setUrlPersonPageDataMatches​(java.util.HashMap<java.lang.String,​PersonPageData> urlPersonPageDataMatches)
        Replace urlPersonPageDataMatches with the given one. Used to replace all the stored PersonPageData instances with new ones (e.g. when importing data from JSON files)
        Parameters:
        urlPersonPageDataMatches - the new urlPersonPageDataMatches HashMap content
      • getUrlPersonPageDataMatches

        public static java.util.HashMap<java.lang.String,​PersonPageData> getUrlPersonPageDataMatches()
        Returns:
        the urlPersonPageDataMatches HashMap of Wikipedia page URL : PersonPageData entries
      • textImpliesEmperorRole

        public static boolean textImpliesEmperorRole​(java.lang.String textLine)
        Check if a given line of text contains something which implies the emperor role
        Parameters:
        textLine - the line of text to check
        Returns:
        true if the line implies the emperor role, false otherwise
      • textImpliesDictatorRole

        public static boolean textImpliesDictatorRole​(java.lang.String textLine)
        Check if a given line of text contains something which implies the dictator role
        Parameters:
        textLine - the line of text to check
        Returns:
        true if the line implies the dictator role, false otherwise
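The two role checks above can be pictured as case-insensitive keyword searches. The following is only a sketch under that assumption: the keywords actually used by PersonPageData are not documented here, so `EMPEROR_HINTS`, `DICTATOR_HINTS` and the class name `RoleTextSketch` are hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch of keyword-based role detection.
final class RoleTextSketch {
    // Assumed keywords; the real lists are not documented.
    private static final String[] EMPEROR_HINTS = {"emperor"};
    private static final String[] DICTATOR_HINTS = {"dictator"};

    // Returns true if the line contains any of the hints, ignoring case.
    private static boolean containsAny(String textLine, String[] hints) {
        if (textLine == null) {
            return false;
        }
        String lower = textLine.toLowerCase(Locale.ROOT);
        for (String hint : hints) {
            if (lower.contains(hint)) {
                return true;
            }
        }
        return false;
    }

    static boolean impliesEmperor(String textLine) {
        return containsAny(textLine, EMPEROR_HINTS);
    }

    static boolean impliesDictator(String textLine) {
        return containsAny(textLine, DICTATOR_HINTS);
    }

    public static void main(String[] args) {
        System.out.println(impliesEmperor("Roman Emperor"));
        System.out.println(impliesDictator("Roman Emperor"));
    }
}
```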
      • extractDates

        private static java.lang.StringBuilder extractDates​(java.lang.String[] possibleDates)
        Check whether each element of the input can be part of a date, so that the output contains numbers, months, a.C or d.C, and ->. Particular cases are also handled, such as an "or" between two dates: the input "date1 or date2" yields the output "date1 or date2".
        Parameters:
        possibleDates - an array of Strings, each of which may be part of a date
        Returns:
        StringBuilder object with the date
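The token filtering described for extractDates might look roughly like the sketch below. It is an illustration under stated assumptions, not the private method itself: the `MONTHS` set is assumed to hold English month names (the real `months` field may use another language), tokens are joined with single spaces, and an "or" is kept only when it sits between two retained date parts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of date-token filtering.
final class DateTokenSketch {
    // Assumed contents; the real `months` set is not documented.
    private static final Set<String> MONTHS = new HashSet<>(Arrays.asList(
            "january", "february", "march", "april", "may", "june",
            "july", "august", "september", "october", "november", "december"));

    // A token counts as a date part if it is a number, a month, an era marker or "->".
    private static boolean isDateToken(String token) {
        if (token.isEmpty()) {
            return false;
        }
        if (token.chars().allMatch(Character::isDigit)) {
            return true;
        }
        if (MONTHS.contains(token.toLowerCase(Locale.ROOT))) {
            return true;
        }
        return token.equals("a.C") || token.equals("d.C") || token.equals("->");
    }

    static StringBuilder extractDates(String[] possibleDates) {
        List<String> kept = new ArrayList<>();
        for (String token : possibleDates) {
            if (token == null) {
                continue;
            }
            if (token.equals("or")) {
                // Keep "or" only if a date part precedes it and it is not repeated.
                if (!kept.isEmpty() && !kept.get(kept.size() - 1).equals("or")) {
                    kept.add(token);
                }
            } else if (isDateToken(token)) {
                kept.add(token);
            }
        }
        // Drop a dangling "or" that was not followed by a second date.
        if (!kept.isEmpty() && kept.get(kept.size() - 1).equals("or")) {
            kept.remove(kept.size() - 1);
        }
        return new StringBuilder(String.join(" ", kept));
    }

    public static void main(String[] args) {
        String[] words = "born 19 August 14 d.C or 20 August 14 d.C in Nola".split(" ");
        System.out.println(extractDates(words));
    }
}
```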
      • clearBrackets

        private static java.lang.StringBuilder clearBrackets​(java.lang.String information)
        Clean a given information string by removing brackets and the characters inside them
        Parameters:
        information - the given information string
        Returns:
        the cleaned result as a StringBuilder instance
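One way to remove brackets together with their content is a single pass with a nesting counter, sketched below. Which bracket types the real clearBrackets handles is not stated, so treating both round and square brackets is an assumption, and `BracketCleanerSketch` is a hypothetical name. Whitespace around removed spans is left untouched in this sketch.

```java
// Hypothetical sketch of bracket removal with a nesting-depth counter.
final class BracketCleanerSketch {
    static StringBuilder clearBrackets(String information) {
        StringBuilder cleaned = new StringBuilder();
        int depth = 0; // how many brackets are currently open
        for (int i = 0; i < information.length(); i++) {
            char c = information.charAt(i);
            if (c == '(' || c == '[') {
                depth++;
            } else if (c == ')' || c == ']') {
                if (depth > 0) {
                    depth--;
                }
                // An unmatched closing bracket is simply dropped.
            } else if (depth == 0) {
                cleaned.append(c); // keep only characters outside any bracket
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        System.out.println(clearBrackets("Nero[3]"));
    }
}
```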
      • getPersonNameUrls

        private static java.util.ArrayList<PersonNameUrl> getPersonNameUrls​(java.util.ArrayList<java.lang.String> peopleNames,
                                                                            org.openqa.selenium.WebElement informationDataElement)
        For each link contained in informationDataElement, check whether its text is also contained in the peopleNames ArrayList of strings and, if so, add it to the output ArrayList
        Parameters:
        peopleNames - an ArrayList containing the people names
        informationDataElement - a WebElement containing the people anchor elements with their text and pointed page URL
        Returns:
        an ArrayList containing the PersonNameUrl(s) of all the people with a Wikipedia page URL
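The matching step of getPersonNameUrls can be sketched without Selenium by replacing the anchor elements with a small stand-in type. Everything named here (`NameLinkFilterSketch`, `Link`, `NameUrl`) is hypothetical; in the real method the links presumably come from the WebElement and the results are PersonNameUrl instances.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of filtering links by known people names.
final class NameLinkFilterSketch {
    // Stand-in for an anchor element: its visible text and target URL.
    static final class Link {
        final String text;
        final String url;

        Link(String text, String url) {
            this.text = text;
            this.url = url;
        }
    }

    // Stand-in for PersonNameUrl.
    static final class NameUrl {
        final String name;
        final String url;

        NameUrl(String name, String url) {
            this.name = name;
            this.url = url;
        }
    }

    // Keep only the links whose text is one of the expected people names.
    static List<NameUrl> filter(List<String> peopleNames, List<Link> links) {
        List<NameUrl> matches = new ArrayList<>();
        for (Link link : links) {
            if (peopleNames.contains(link.text)) {
                matches.add(new NameUrl(link.text, link.url));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Tiberius", "Livia");
        List<Link> links = Arrays.asList(
                new Link("Tiberius", "https://example.org/wiki/Tiberius"),
                new Link("Rome", "https://example.org/wiki/Rome"));
        for (NameUrl match : filter(names, links)) {
            System.out.println(match.name + " -> " + match.url);
        }
    }
}
```

People named in the text without a link produce no entry, which matches the documented return value of only those people who have a Wikipedia page URL.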
      • isEmperorOrDictator

        public boolean isEmperorOrDictator()
        Returns:
        true if the person's role is Emperor or Dictator, false otherwise
      • getPersonNameUrl

        public PersonNameUrl getPersonNameUrl()
        Returns:
        the PersonNameUrl instance related to this PersonPageData
      • getMotherNameUrl

        public PersonNameUrl getMotherNameUrl()
        Returns:
        the PersonNameUrl instance related to this person's mother
      • getFatherNameUrl

        public PersonNameUrl getFatherNameUrl()
        Returns:
        the PersonNameUrl instance related to this person's father
      • getAdoptiveFatherNameUrl

        public PersonNameUrl getAdoptiveFatherNameUrl()
        Returns:
        the PersonNameUrl instance related to this person's adoptive father
      • getSuccessors

        public java.util.ArrayList<PersonNameUrl> getSuccessors()
        Returns:
        an ArrayList containing the PersonNameUrl instance of each successor
      • getSpouses

        public java.util.ArrayList<PersonNameUrl> getSpouses()
        Returns:
        an ArrayList containing the PersonNameUrl instance of each spouse
      • getChildren

        public java.util.ArrayList<PersonNameUrl> getChildren()
        Returns:
        an ArrayList containing the PersonNameUrl instance of each child
      • getAdoptedChildren

        public java.util.ArrayList<PersonNameUrl> getAdoptedChildren()
        Returns:
        an ArrayList containing the PersonNameUrl instance of each adopted child
      • getBirthDate

        public java.lang.String getBirthDate()
        Returns:
        the person's birth date
      • getDeathDate

        public java.lang.String getDeathDate()
        Returns:
        the person's death date
      • getReignBeginningDate

        public java.lang.String getReignBeginningDate()
        Returns:
        the person's reign beginning date. It may be null if the person isn't an emperor or a dictator.
      • getReignEndDate

        public java.lang.String getReignEndDate()
        Returns:
        the person's reign end date. It may be null if the person isn't an emperor or a dictator.
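Because both reign dates may be null, callers need a null-safe way to use them. The helper below is a hypothetical caller-side sketch, not part of PersonPageData; `ReignDatesSketch`, `describeReign` and the placeholder strings are assumptions chosen for illustration.

```java
import java.util.Optional;

// Hypothetical caller-side helper for nullable reign dates.
final class ReignDatesSketch {
    // Formats a reign span, tolerating null dates for people who never reigned.
    static String describeReign(String reignBeginningDate, String reignEndDate) {
        if (reignBeginningDate == null && reignEndDate == null) {
            return "no recorded reign";
        }
        String begin = Optional.ofNullable(reignBeginningDate).orElse("?");
        String end = Optional.ofNullable(reignEndDate).orElse("?");
        return begin + " -> " + end;
    }

    public static void main(String[] args) {
        System.out.println(describeReign("27 a.C", "14 d.C"));
        System.out.println(describeReign(null, null));
    }
}
```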
      • getPersonDynastyPageUrl

        public java.lang.String getPersonDynastyPageUrl()
        Returns:
        the Wikipedia page URL of the person's dynasty
      • getRole

        public java.lang.String getRole()
        Returns:
        the person's role (Emperor or Dictator)
      • getImageUrl

        public java.lang.String getImageUrl()
        Returns:
        the person's image URL
      • getDynastiesPeopleList

        public static java.util.HashMap<java.lang.String,​java.util.HashMap<java.lang.String,​java.lang.String>> getDynastiesPeopleList()
        Returns:
        a HashMap whose keys are the dynasties' Wikipedia page URLs and whose values are HashMaps of each dynasty's people, with name-birthdate as key and the person's Wikipedia page URL as value
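The nested map returned by getDynastiesPeopleList can be illustrated as follows. This is a sketch of the data shape only: the exact format of the name-birthdate key is not documented, so joining name and birth date with a hyphen is an assumption, and `DynastiesPeopleSketch`, `register` and `lookup` are hypothetical names.

```java
import java.util.HashMap;

// Hypothetical sketch of the dynasty URL -> (name-birthdate -> person URL) structure.
final class DynastiesPeopleSketch {
    static final HashMap<String, HashMap<String, String>> dynasties = new HashMap<>();

    // Assumed key format: name and birth date joined by a hyphen.
    private static String key(String name, String birthDate) {
        return name + "-" + birthDate;
    }

    // Adds a person to a dynasty, creating the inner map on first use.
    static void register(String dynastyUrl, String name, String birthDate, String personUrl) {
        dynasties.computeIfAbsent(dynastyUrl, unused -> new HashMap<>())
                .put(key(name, birthDate), personUrl);
    }

    // Returns the person's page URL, or null if the dynasty or person is unknown.
    static String lookup(String dynastyUrl, String name, String birthDate) {
        HashMap<String, String> people = dynasties.get(dynastyUrl);
        return people == null ? null : people.get(key(name, birthDate));
    }

    public static void main(String[] args) {
        register("https://example.org/wiki/Dynasty", "Augustus", "63 a.C",
                "https://example.org/wiki/Augustus");
        System.out.println(lookup("https://example.org/wiki/Dynasty", "Augustus", "63 a.C"));
    }
}
```

Combining name and birth date in the key presumably helps distinguish homonymous people within the same dynasty.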