pgloader will keep a separate file of rejected data, but continue trying to copy good data in your database.
pgloader also implements data reformatting, a typical example of that being the transformation of MySQL datestamps 0000-00-00 and 0000-00-00 00:00:00 to PostgreSQL NULL value
@Repository
public interface PostRepository extends BaseJpaRepository<Post, Long> {
@Query("""
select p
from Post p
where date(p.createdOn) >= :sinceDate
"""
)
@QueryHints(
@QueryHint(name = AvailableHints.HINT_FETCH_SIZE, value = "25")
)
Stream<Post> streamByCreatedOnSince(@Param("sinceDate") LocalDate sinceDate);
}
The FETCH_SIZE JPA query hint is necessary for PostgreSQL and MySQL to instruct the JDBC Driver to prefetch at most 25 records. Otherwise, the PostgreSQL and MySQL JDBC Drivers would prefetch all the query results prior to traversing the underlying ResultSet.
Postgres allows:
UPDATE dummy
SET customer=subquery.customer,
address=subquery.address,
partn=subquery.partn
FROM (SELECT address_id, customer, address, partn
FROM /* big hairy SQL */ ...) AS subquery
WHERE dummy.address_id=subquery.address_id;
This syntax is not standard SQL
Currently adding a column to a table with a non-NULL default results in
a rewrite of the table. For large tables this can be both expensive and
disruptive. This patch removes the need for the rewrite as long as the
default value is not volatile. The default expression is evaluated at
the time of the ALTER TABLE and the result stored in a new column
(attmissingval) in pg_attribute, and a new column (atthasmissing) is set
to true. Any existing row when fetched will be supplied with the
attmissingval. New rows will have the supplied value or the default and
so will never need the attmissingval.
CREATE TABLE books (
book_id serial NOT NULL,
data jsonb
);
INSERT INTO books VALUES (1, '{"title": "Sleeping Beauties", "genres": ["Fiction", "Thriller", "Horror"], "published": false}');
SELECT data->'title' AS title FROM books WHERE data->'published' = 'false';
CREATE INDEX idx_published ON books ((data->'published'));
select * from tbl TABLESAMPLE system (5)
The SYSTEM method is significantly faster than the BERNOULLI method when small sampling percentages are specified, but it may return a less-random sample of the table as a result of clustering effects.
SELECT * FROM pg_stat_activity WHERE state = 'active';
So you can identify the PID of the hanging query you want to terminate, run this:
SELECT pg_cancel_backend(PID);
-- show running queries (9.2)
SELECT pid, age(clock_timestamp(), query_start), usename, query
FROM pg_stat_activity
WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;