기말고사 연습문제 5-8장

Notice

Recent Posts

Recent Comments

Link

« 2024/12 »
일	월	화	수	목	금	토
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31

Tags more

Archives

Today

Total

관리 메뉴

사과하는 제라스

기말고사 연습문제 5-8장 본문

대학 전공 공부/데이터베이스1

기말고사 연습문제 5-8장

Xerath(제라스) 2022. 12. 12. 03:09

728x90

5.1 일반적으로 서로 다른 검색 키에 대해 동일한 관계에 있는 두 개의 군집화 지수를 갖는 것이 가능한가? 당신의 답을 설명하세요.

답변: 일반적으로 동일한 값을 함께 저장하려면 관계의 튜플이 서로 다른 순서로 저장되어야 하기 때문에 서로 다른 키에 대해 동일한 관계에 있는 두 개의 기본 인덱스를 가질 수 없습니다. 우리는 관계를 두 번 저장하고 모든 값을 복제함으로써 이것을 달성할 수 있지만, 이것은 데이터베이스 시스템에서는 단순히 허용되지 않는다.

5.11 When is it preferable to use a dense index rather than a sparse index? Explain your answer.

Answer: It is preferable to use a dense index instead of a sparse index when the file is not sorted on the indexed field (such as when the index is a secondary index) or when the index file is small compared to the size of memory.

5.11 희소 지수보다 밀도가 높은 지수를 사용하는 것이 선호되는 경우는? 당신의 답을 설명하세요.
답변: 인덱스 필드에 파일이 정렬되지 않거나(예: 인덱스가 보조 인덱스인 경우) 메모리 크기에 비해 인덱스 파일이 작을 때는 희소 인덱스 대신 밀도가 높은 인덱스를 사용하는 것이 좋습니다.

즉, Secondary indexing과 메모리 크기가 큰 거에 비해 index file이 작은 경우 dense indexing이 의미있음.

5.12 What is the difference between a clustering index and a secondary index?

Answer: The clustering index is on the field which specifies the sequential order of the file. There can be only one clustering index while there can be many secondary indices.

5.12 군집화 지수와 2차 지수의 차이점은 무엇입니까?
답변: 클러스터링 인덱스는 파일의 순서를 지정하는 필드에 있습니다. 클러스터링 인덱스는 하나만 있을 수 있지만 보조 인덱스는 여러 개 있을 수 있습니다.

clustering index = primary index 의 다른 표현

-> primary index는 하나의 파일만을 가리키는 반면, secondary index는 여러 file을 가리킬 수 있음.

5.13 Explain the distinction between closed and open hashing. Discuss the relative merits of each technique in database applications.

(Answer)
- Open hashing may place keys with the same hash function value in different buckets. Closed hashing always places such keys together in the same bucket.
- Deletion is difficult with open hashing as all the buckets may have to be inspected before we can ascertain that a key value has been deleted, whereas in closed hashing only the bucket whose address is obtained by hashing the key value need to be inspected.
- Deletions are more common in databases and hence closed hashing is more appropriate for them. For a small, static set of data lookups, it may be more efficient using open hashing. The symbol table of a compiler would be a good example.

5.13 닫힌 해싱과 열린 해싱의 구별을 설명한다. 데이터베이스 응용 프로그램에서 각 기술의 상대적 장점에 대해 논의합니다.
(정답)
- 개방형 해시는 동일한 해시 함수 값을 가진 키를 다른 버킷에 배치할 수 있습니다. 닫힌 해시는 항상 이러한 키를 동일한 버킷에 함께 배치합니다.
- 키 값이 삭제되었는지 확인하기 전에 모든 버킷을 검사해야 할 수 있기 때문에 열린 해시에서는 삭제가 어렵습니다. 반면 폐쇄 해시에서는 키 값을 해시하여 주소를 얻은 버킷만 검사해야 합니다.
- 삭제는 데이터베이스에서 더 일반적이므로 닫힌 해시가 더 적합합니다. 작고 정적인 데이터 룩업 세트의 경우 개방형 해시를 사용하는 것이 더 효율적일 수 있습니다. 컴파일러의 심볼 테이블이 좋은 예가 될 것이다.

5.14 Why is a hash structure not the best choice for a search key on which range queries are likely?

A range query cannot be answered efficiently using a hash index, we will have to read all the buckets. This is because key values in the range do not occupy consecutive locations in the buckets, they are distributed uniformly and randomly throughout all the buckets.

5.14 왜 해시 구조가 범위 쿼리가 가능한 검색 키에 대한 최선의 선택이 아닌가?

범위 쿼리는 해시 인덱스를 사용하여 효율적으로 응답할 수 없습니다. 모든 버킷을 읽어야 합니다. 이는 범위의 키 값이 버킷의 연속된 위치를 차지하지 않고 모든 버킷에 걸쳐 균일하고 랜덤하게 분포되기 때문입니다.

즉, 데이터들이 붙어서 저장되지 않고 이곳저곳에 있으면 시간이 오래걸림. 다 찾아야 하니까..!!

5.15 What are the causes of bucket overflow in a hash file organization? What can be done to reduce the occurrence of bucket overflows?

Answer: The causes of bucket overflow are:
a. Our estimate of the number of records that the relation will have was too low, and

hence the number of buckets allotted was not sufficient.
b. Skew in the distribution of records to buckets. This may happen either because there

are many records with the same search key value, or because the hash function

chosen did not have the desirable properties of uniformity and randomness. To reduce the occurrence of overflows, we can:

a. Choose the hash function more carefully, and make better estimates of the relation size.

b. If the estimated size of the relation is nr and number of records per block is fr, allocate (nr / fr)*(1+d) buckets instead of (nr / fr) buckets. Here d is a fudge factor, typically around 0.2. Some space is wasted: About 20 percent of the space in the buckets will be empty. But the benefit is that some of the skew is handled and the probability of overflow is reduced.

5.15 해시 파일 조직에서 버킷 오버플로의 원인은 무엇입니까? 버킷 오버플로 발생을 줄이기 위해 무엇을 할 수 있습니까?
답변: 버킷 오버플로의 원인은 다음과 같습니다.
a. 관계가 갖게 될 기록의 수에 대한 우리의 추정치는 너무 낮았고,
할당된 버킷 수가 충분하지 않았습니다.
b. 버킷에 대한 레코드 배포가 왜곡됩니다. 이것은 아마도 거기에 있기 때문에 발생할 수 있다.
동일한 검색 키 값을 가진 많은 레코드이거나 해시 함수 때문입니다.
선택한 값은 균일성과 무작위성이라는 바람직한 특성을 가지고 있지 않습니다. 오버플로 발생을 줄이기 위해 다음을 수행할 수 있습니다.
a. 해시 함수를 더 신중하게 선택하고 관계 크기를 더 잘 추정하십시오.
b. 관계의 추정 크기가 nr이고 블록당 레코드 수가 fr이면 (nr / fr) 버킷 대신 (nr / fr)*(1+d) 버킷을 할당합니다. 여기서 빨간색은 일반적으로 0.2 정도의 퍼지 계수입니다. 일부 공간이 낭비됩니다. 양동이 안에 있는 공간의 약 20%가 비어 있을 것이다. 그러나 일부 스큐가 처리되고 오버플로 가능성이 감소하는 이점이 있습니다.

5.16 Consider the following bitmap technique for tracking free space in a file. For each block in the file, two bits are maintained in the bitmap. If the block is between 0 and 30 percent full the bits are 00, between 30 and 60 percent the bits are 01, between 60 and 90 percent the bits are 10, and above 90 percent the bits are 11. Such bitmaps can be kept in memory even for quite large files.

a. Describe how to keep the bitmap up to date on record insertions and deletions.
b. Outline the benefit of the bitmap technique over free lists in searching for free space

and in updating free space information.

Answer:
a. Everytime a record is inserted/deleted, check if the usage of the block has changed levels. In

that case, update the corresponding bits. Note that we don’t need to access the bitmaps at all

unless the usage crosses a boundary, so in most of the cases there is no overhead.
b. When free space for a large record or a set of records is sought, then multiple free list

entries may have to be scanned before finding a proper size done, so overheads are much higher. With bitmaps, one page of bitmap can store free information for many pages, so I/O spent for finding free space is minimal. Similarly, when a whole block or a large part of it is deleted, bitmap technique is more convenient for updating free space information.

5.16 파일의 빈 공간을 추적하기 위해 다음 비트맵 기술을 고려하십시오. 파일의 각 블록에 대해 비트맵에서 두 비트가 유지됩니다. 블록이 0~30% 가득 차면 비트는 00, 30~60%는 01, 60~90%는 10, 90% 이상은 11이 됩니다. 이러한 비트맵은 꽤 큰 파일에서도 메모리에 보관될 수 있다.
a. 레코드 삽입 및 삭제 시 비트맵을 최신 상태로 유지하는 방법을 설명합니다.
b. 사용 가능한 공간을 검색할 때 사용 가능한 목록에 비해 비트맵 기술의 이점을 개략적으로 설명합니다.
사용 가능한 공간 정보를 업데이트합니다.
답변:
a. 레코드를 삽입/삭제할 때마다 블록의 사용 수준이 변경되었는지 확인합니다. 인
이 경우 해당 비트를 업데이트합니다. 비트맵에 액세스할 필요가 전혀 없습니다.
용도가 경계를 넘지 않는 한 대부분의 경우 오버헤드가 없습니다.
b. 대용량 레코드 또는 레코드 집합을 위한 사용 가능한 공간이 필요한 경우, 여러 개의 사용 가능한 목록
적절한 크기를 찾기 전에 항목을 스캔해야 할 수 있으므로 오버헤드가 훨씬 더 높습니다. 비트맵을 사용하면 한 페이지의 비트맵이 여러 페이지에 대한 무료 정보를 저장할 수 있으므로 사용 가능한 공간을 찾는 데 드는 I/O가 최소화됩니다. 마찬가지로 블록 전체 또는 블록의 상당 부분이 삭제될 때 비트맵 기술은 빈 공간 정보를 업데이트하는 데 더 편리합니다.

5.17 Indices speed query processing, but it is usually a bad idea to create indices on every attribute and every combination of attributes that is a potential search keys. Explain why. (You should mention at least two reasons)

(Answer)
- Every index requires additional CPU time and disk IO overhead during inserts and deletions
- Indices on non-primary keys might have to be changed on updates, although an index on the primary key might not (this is because updates typically do not modify the primary key attributes)
- Each extra index requires additional storage space.
- For queries which involve conditions on several search keys, efficiency might not be bad even if only some of the keys have indices on them. Database performance is improved less by adding indices when many indices already exist.

5.17 인덱스는 쿼리 처리 속도를 향상시키지만, 일반적으로 잠재적인 검색 키인 모든 속성과 속성의 모든 조합에 인덱스를 생성하는 것은 좋지 않습니다. 이유를 설명합니다. (적어도 두 가지 이유를 언급해야 합니다.)
(정답)
- 삽입 및 삭제 시 모든 인덱스에 추가 CPU 시간 및 디스크 IO 오버헤드가 필요
- 기본 키가 아닌 다른 키의 인덱스는 업데이트 시 변경해야 할 수 있지만 기본 키의 인덱스는 그렇지 않을 수 있습니다(업데이트가 일반적으로 기본 키 속성을 수정하지 않기 때문입니다).
- 인덱스를 추가할 때마다 추가 저장 공간이 필요합니다.
- 여러 검색 키에 대한 조건을 포함하는 쿼리의 경우 일부 키에만 인덱스가 있더라도 효율성은 나쁘지 않을 수 있습니다. 많은 인덱스가 이미 존재할 때 인덱스를 추가하면 데이터베이스 성능이 덜 향상됩니다.

8.1 A car-rental company maintains a database for all vehicles in the company. For all vehicles, it includes the vehicle identification number, license number, manufacturer, model, data of purchase and color. Special data are included for certain types of vehicles:

- trucks: cargo capacity
- sports cars: horsepower, rental age requirement
- vans: number of passengers
- off-road vehicles: ground clearance, drivetrain (four-wheeled, two-wheeled)

Construct an SQL schema definitions for this database. Use inheritance where appropriate. If necessary, add appropriate attributes.

모든 수단에는

1) identification num

2) license num

3) manufacturer

4) model

5) purchase data

6) color data

7) 특정 유형의 차에는 special data가 있음.

special data란?

- trucks: cargo capacity
- sports cars: horsepower, rental age requirement
- vans: number of passengers
- off-road vehicles: ground clearance, drivetrain (four-wheeled, two-wheeled)

이런 것들이 있다.

8.2 Consider the following relational schema.

- employee(personName, city)

- works(personName, companyName, salary)

- company(companyName, city)

- manager(personName, managerName)

(a) Give a schema definition for the above schema. You should use references to express foreign-key relationships.

(b) Write an SQL statement to employee names supervised by a manager who lives in Seoul.

(d) Write an SQL statement to find the company name with the most employees.

728x90

'대학 전공 공부 > 데이터베이스1' 카테고리의 다른 글

0. 시작하기 (0)	2022.09.15
11. 데이터베이스 설계 이론 (0)	2022.06.14
7. 오라클 실습 2 (0)	2022.06.14
6. 데이터베이스 시스템 주요 기능 (0)	2022.06.13
8. 응용 개발 (0)	2022.05.16

'대학 전공 공부/데이터베이스1' Related Articles

사과하는 제라스

기말고사 연습문제 5-8장 본문

기말고사 연습문제 5-8장

'대학 전공 공부 > 데이터베이스1' 카테고리의 다른 글

티스토리툴바