数据结构*Set和Map类-EW帮帮网

Set和Map类

在这里插入图片描述
Map（映射）和Set（集合）是一种专门用来进行搜索的容器或者数据结构。
这两个类能快速判断元素是否存在(Set)或通过键快速查找对应值(Map)，其时间复杂度通常为O(1)（哈希实现）或者O(logN)（树实现）。
Set类的核心用途：快速判断元素是否存在。因为它的数据模型为纯Key模型。只关注关键字（key ）是否存在，不涉及对应值。
Map类的核心用途：通过键快速查找值。因为它的数据模型为Key-Value模型。存储关键字（key ）和对应的值（Value ）。
Map类中的HashMap和Set类中的HashSet是用哈希实现的；Map类中的TreeMap和Set类中的TreeSet是用树实现的。

Map常用方法：

方法	功能
V get(Object key)	返回key对应的value；key不存在返回null
V getOrDefault(Object key,V defaultValue)	返回key对应的value；key不存在返回defaultValue默认值
V put(K key,V value)	设置key对应的value
V remove(Object key)	删除key对应的映射关系
Set<K> keySet()	将所有的Key放在Set这个（不可重复）集合中，返回类型为Set<K>（关键字）
Collection<V> values()	将所有的value放在Collection这个(可重复)集合中，返回类型为Collection<V>（对应值）
boolean containsKey(Object key)	判断是否包含key
boolean containsValue(Object value)	判断是否包含value
Set<Map.Entry<K,V>> entrySet()	返回所有的key-value映射关系

public static void main(String[] args) {
    Map<String, Integer> map = new TreeMap<>();
    map.put("hello",10);//无效
    map.put("hello",30);//无效
    map.put("hello",20);
    map.put("world",20);
    System.out.println(map);
    System.out.println(map.get("hello"));
    System.out.println(map.get("hhhh"));
    map.remove("hello");
    System.out.println(map.getOrDefault("hello", 100));
    Set<String> set = map.keySet();
    System.out.println(set);
    System.out.println(map.containsKey("world"));
    System.out.println(map.containsValue(10));
    Set<Map.Entry<String, Integer>> entrySet = map.entrySet();
    System.out.println(entrySet);
}

注意：

1、Map.Entry<K,V>是Map类内部实现的用来存放<key,value>键值对映射关系的内部类，提供了获取<key,value>整体的方法等。

方法	解释
K getKey()	返回entry中的key
V getValue()	返回entry中的value
V setValue(V value)	修改键值对中Value的值

在Entry<K,V>中没有提供设置Key的方法

2、插入的Key一定要是可以比较的类型。要么实现Comparable接口，要么传入比较器。
3、Map中存放的Key是唯一的，但Value可以重复。只有最后一个重复的会被存储到Map中。
4、TreeMap中不能插入为null的Key，而HashMap可以。如果TreeMap中插入为null的Key，则抛异常NullPointerException。
5、Map中的键值对Key不能直接修改，要么就直接删除再添加。

Set常用方法：

方法	功能
boolean add(E e)	添加元素
boolean contains(Object o)	判断o是否在集合中
Iterator<E> iterator	返回迭代器
boolean remove(Object o)	删除set中的元素
boolean containsAll(Collection<?> c)	集合c中的元素是否在set中全部存在
boolean addAll(Collection<? enxtends E> c)	将集合c中的元素添加到set中
void clear()	清空集合
int size()	返回set中元素的个数
boolean isEmpty()	是否为空
Object[] toArray()	将set中的元素转换为数组返回

注意：

1、Set类只存储了Key，要求Key是唯一的！当添加数据有重复时，会自动的去重。

public static void main(String[] args) {
    Set<String> set = new TreeSet<>();
    set.add("world");
    set.add("hello");
    set.add("hello");
    System.out.println(set);//输出：[hello, world]
}

2、Set类继承了Collection类，所有可以定义比较器，来进行遍历set集合。

public static void main(String[] args) {
    Set<String> set = new TreeSet<>();
    set.add("world");
    set.add("hello");
    set.add("hello");
    System.out.println(set);//输出：[hello, world]
    Iterator<String> iterator = set.iterator();
    while (iterator.hasNext()) {
        System.out.print(iterator.next()+" ");
    }//输出：hello world 
}

3、通过源码发现TreeSet是使用Map类中的TreeMap实现的。其对应的值默认为Object插入到Map中。
在这里插入图片描述
4、TreeSet中不能插入为null的Key，而HashSet可以。如果TreeSet中插入为null的Key，则抛异常NullPointerException。

哈希表

也叫散列表。其核心原理是通过哈希函数把键（Key）映射到存储桶或槽位，从而快速查找数据。
哈希函数就是将Key存储到对应的哈希地址计算过程中使用的计算式。（一般我们使用Java自带的hash函数就可以了）
当两个Key的值不同，但通过哈希函数计算出来的存储的哈希地址相同就会产生冲突。
这时候我们需要去避免/减少冲突：1、设计合理的哈希函数 2、调节负载因子。
负载因子定义：α = 填入表中的元素个数 / 散列表的长度。当负载因子越大，冲突越大；当负载因子越小，冲突越小。为了降低负载因子，可以增加散列表的长度。

解决冲突

有两种方法：闭散列和开散列(哈希桶)
闭散列是采用数组的方式，通过哈希函数找到对应的哈希码，当发生冲突时，把Key存放到冲突的“下一个”位置。对于找“下一个”位置的方法有线性探测、二次探测等。
开散列是采用数组+链表+红黑树的方式。

实现一个哈希桶

使用数组+链表的方式来简单实现哈希桶

代码展示1

public class HashBuck {
    static class Node{
        public int key;
        public Integer value;
        public Node next;

        public Node(int key, int value) {
            this.key = key;
            this.value = value;
        }
    }
    public Node[] array;
    public int usedSize;
    public static final double LOAD_FACTOR = 0.75;//负载因子

    public HashBuck() {
        array = new Node[10];
    }
    public void push(int key,int value){
        int index = key % array.length;//哈希函数
        Node cur = array[index];//找到对应的哈希地址
        while (cur != null) {//遍历存储的节点，看是否存在重复的
            if(cur.key == key) {
                cur.value = value;
                return;
            }
            cur = cur.next;
        }
        //开始进行头插
        Node newNode = new Node(key,value);
        newNode.next = array[index];//新节点的下一个节点 为 旧节点
        array[index] = newNode;//array[index]指向新节点（这个数组单元存储首链表）
        usedSize++;
        //计算负载因子
        if(calcLOADFACTOR() >= LOAD_FACTOR) {
            //扩容
            //重新哈希一遍，因为有些数据的哈希地址发生变化
            resize();
        }
    }

    private void resize() {
        Node[] newArray = new Node[array.length*2];
        //重新哈希，开始遍历数组中的每个链表
        for (int i = 0; i < array.length; i++) {
            //遍历每个数组下标中是否存在链表，并重新放入新的数组中
            Node cur = array[i];
            while (cur != null) {//cur不为空说明这个下标有链表
                int newIndex = cur.key % newArray.length;
                //在新数组中进行头插法
                Node curN = cur.next;//curN用来标记cur的下一个节点
                cur.next = newArray[newIndex];
                newArray[newIndex] = cur;
                cur = curN;
            }
        }
        array = newArray;
    }

    private double calcLOADFACTOR() {
        return usedSize * 1.0 / array.length;
    }

    public Integer get(int key){
        int index = key % array.length;
        Node node = array[index];
        while (node != null) {
            if(node.key == key) {
                return node.value;
            }
            node = node.next;
        }
        return null;
    }
}

代码解释1

put方法中resize方法中重新哈希的大致图解
在这里插入图片描述
这时候我们会产生问题，当key为String类型的时候，这代码就会错误。当进行哈希函数计算时，字符串类型是不能进行%操作的。这时候可以利用泛型进行优化。其次，在引用类型比较内容的时候不能使用==来比较，而需要用equals方法。

代码展示2

public class HashBuckPlus<K,V> {
    static class Node<K,V>{
        public K key;
        public V value;
        public Node<K,V> next;

        public Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }
    public Node<K,V>[] array;
    public int usedSize;
    public static final double LOAD_FACTOR = 0.75;//负载因子

    public HashBuckPlus() {
        array = (Node<K,V>[])new Node[10];
    }
    public void push(K key,V value){
        //将K类型转成int类型
        int hash = Math.abs(key.hashCode());//用Math.abs()方法是为了避免出现负数导致下标越界
        int index = hash % array.length;
        Node<K,V> cur = array[index];
        while (cur != null) {
            if(cur.key.equals(key)) {
                cur.value = value;
                return;
            }
            cur = cur.next;
        }
        Node<K,V> newNode = new Node<K,V>(key,value);
        newNode.next = array[index];
        array[index] = newNode;
        usedSize++;
        if(calcLOADFACTOR() >= LOAD_FACTOR) {
            resize();
        }
    }
    private void resize() {
        Node<K,V>[] newArray = (Node<K,V>[])new Node[array.length*2];
        for (int i = 0; i < array.length; i++) {
            Node<K,V> cur = array[i];
            while (cur != null) {
                int hash = Math.abs(cur.key.hashCode());
                int newIndex = hash % newArray.length;
                Node<K,V> curN = cur.next;
                cur.next = newArray[newIndex];
                newArray[newIndex] = cur;
                cur = curN;
            }
        }
        array = newArray;
    }

    private double calcLOADFACTOR() {
        return usedSize * 1.0 / array.length;
    }
    public V get(K key){
        int hash = Math.abs(key.hashCode());
        int index = hash % array.length;
        Node<K,V> node = array[index];
        while (node != null) {
            if(node.key.equals(key)) {
                return node.value;
            }
            node = node.next;
        }
        return null;
    }
}

代码解释2

代码中利用Object类中的 hashCode() 方法【public native int hashCode();】和 equals() 方法【public boolean equals(Object obj) {return (this == obj);}】
hashCode() 方法用于返回对象的哈希码（一个 int 类型的数值）。
在我们定义一个新的类时，每次new一个对象，其hashcode值都不一样。但有时候这两个对象之间逻辑上是一样的，我们希望其hashcode值要一样，这时候我们需要重写 hashCode() 方法和 equals() 方法。(利用IDEA自带的生成方法的工具快捷键)

eg.

public class Student {
    public int id;
    public String name;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Student student = (Student) o;
        return id == student.id && Objects.equals(name, student.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, name);
    }

    public Student(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

public static void main(String[] args) {
    Student student = new Student(1,"zhangssan");
    Student student1 = new Student(1,"zhangssan");
    Student student2 = new Student(2,"lisi");
    HashBuckPlus<Student,Integer> hashBuckPlus = new HashBuckPlus<>();
    hashBuckPlus.push(student,12);
    hashBuckPlus.push(student1,24);
    hashBuckPlus.push(student2,36);
    System.out.println(hashBuckPlus.get(student));//24
    System.out.println(hashBuckPlus.get(student1));//24
    System.out.println(hashBuckPlus.get(student2));//36
    System.out.println("----------------------------------");
    HashMap<Student,Integer> hashMap = new HashMap<>();
    hashMap.put(student,12);
    hashMap.put(student1,24);
    hashMap.put(student2,36);
    System.out.println(hashMap.get(student));//24
    System.out.println(hashMap.get(student1));//24
    System.out.println(hashMap.get(student2));//36
}

上面的Student类重写了 hashCode() 方法和 equals() 方法，使其能用于使用哈希表。

总结

1、哈希表依赖 hashCode() 快速定位，依赖 equals() 确认相等。对于一个自定义类在使用hash的时候需要重写 hashCode() 方法和 equals() 方法！
2、通常认为哈希表的插入/删除/查找的时间复杂度为O(1)。

数据结构*Set和Map类

Set和Map类

Map常用方法：

注意：

Set常用方法：

注意：

哈希表

解决冲突

实现一个哈希桶

代码展示1

代码解释1

代码展示2

代码解释2

总结

网站公告

今日签到

热门文章

最新发布